MENTAL TEST PERFORMANCE AS A FUNCTION OF
PAYOFF CONDITIONS, ITEM DIFFICULTY,
AND DEGREE OF SPEEDING*

MOHAMMED
It is commonly recognized that changes in
test directions and test conditions have im- 
portant influence on individual differences in 
test performance. All standardized tests in- 
clude directions as an inseparable part of the 
total test. These directions are meant to con- 
trol the individual’s motivation and mental 
set and keep them at more or less uniform 
level. Current experiments in decision making 
(Thrall, Coombs, & Davis, 1954) implicitly 
suggest that individuals do not always per- 
form a task or make decisions in accordance 
with the “rational” requirements of an ob- 
jective situation as defined by the experi- 
menter. Studies on response sets (Cronbach, 
1950) indicate wide individual differences 
with respect to the effect of test directions 
on test behavior.

One way of looking at test directions is to 
consider them as rules that govern and de- 
termine the process, product, and resultant 
payoff of the game. If the total payoff in a 
particular context depends upon a meticulous 
adherence to the rules of the game then, other 
things being equal, the more closely an indi- 
vidual’s performance follows the rules of the 
game, the better is his payoff. Conversely, different
payoff conditions can be operationally
defined and manipulated and their effect on
test performance can be experimentally observed
and analyzed.

1 This article is based on the author’s PhD thesis
submitted to the Graduate College of the University
of Illinois in 1958. Special thanks are due to J.
Thomas Hastings and Lee J. Cronbach for their
helpful criticisms and suggestions throughout the
course of this investigation. The author is also
indebted to the Bureau of Educational Research for
financing part of this study.

2 The experiment was carried out at the University
of Michigan, Ann Arbor, during 1957-58. The
author deeply appreciates the help and encouragement
provided by Clyde H. Coombs.

In addition, there are a number of assump- 
tions implicit in different theories of intelli- 
gence (e.g., Thurstone, 1937; Furneaux, 1955) 
that have not been subjected to adequate ex- 
perimental test. One of these assumptions is 
that there is no interaction between such vari- 
ables as motivation, difficulty level, and time 
limit on the one hand, and intelligence on the 
other, i.e., persons who, on a test of mental 
ability, perform well under one condition will 
perform equally well under other test condi- 
tions within the limits of reliability. On a 
common sense level this proposition may seem 
to be sound but it cannot be justifiably ac- 
cepted unless experimentally verified. It is 
possible that some individuals improve their 
relative performance under certain conditions 
of testing while others deteriorate. 

This study was designed to investigate the 
uniform as well as differential response of in- 
dividuals to various test conditions. The gen- 
eral question under consideration was: Do 
the persons who perform well under one test 
condition perform equally well, within the 
limits of reliability, under other test condi- 
tions, whether the change in test conditions 
is due to variation in payoff* or item diffi- 
culty or time limit or a combination of two 
or more of them (the test content being the 
same)? The following questions were spe- 
cifically considered: 

3 The terms payoff and motivation, and time limit 
and degree of speeding have been used interchange- 
ably in this study. 
TABLE 1
MEAN p VALUES(a) AND TEST-RETEST RELIABILITY
ESTIMATES OF THE TWENTY-ITEM SUBTESTS

Column headings: Difficulty Level; Form A (Mean p); Form B.
[Table entries are not recoverable from the source.]

a The proportion of individuals in the standardization group
of 94 passing a certain item is denoted as the p value of that item.


1. Do the payoff conditions influence the 
mean test performance when the difficulty 
level and time limit are held constant? 

2. Is there any interaction between (a) persons
and payoff conditions, (b) persons and
difficulty levels, and (c) persons and time
limits?

3. Is there any interaction among (a) persons,
payoff conditions, and difficulty levels,
(b) persons, payoff conditions, and time
limits, and (c) persons, difficulty levels, and
time limits?

The answers to the above questions were
sought not only from the analysis of rights
scores but also from the analysis of wrongs
and omissions scores in the hope that such a
comparison would yield meaningful information.
Regarding interaction effects, the purpose
was (a) to test their statistical significance
and (b) to attempt to describe and
interpret them by means of appropriate techniques.

METHOD 
Subjects 


The Ss were 41 freshmen, sophomores, and juniors
(27 males and 14 females) from an undergraduate
psychology course at the University of Michigan.4
Selection was based upon willingness to participate
after a brief explanation of the experiment:

I understand that you are taking a course in
psychology this semester. I am sure you know that
there is a course requirement of participating in
some sort of psychological experiment for two
hours outside the class time.

I am running an experiment next week and
would like to explain to you what it is about. If
you feel interested, I can give you further information
about time and place of the experiment.
This experiment consists mainly of solving problems
made up of the English alphabet under different
rules and conditions. The experiment will be
conducted in two sessions of approximately an
hour and a half each. The total time that you will
have to devote is about three hours. Two of these
three hours will count toward the fulfillment of
your course requirement.

You are to decide whether or not you would
like to devote an extra hour or so. You will not
be paid for this extra time. However, the experiment
is so set up that you will win at least three
cents for each correct solution, and there are about
200 problems. Let me also explain that under certain
conditions you will lose three cents or more
for each incorrect solution. In the past those who
participated in this experiment were, in general,
able to make some money. However, under no
circumstances do you have to pay any money from
your own pocket.

4 Out of 50 students contacted, 41 participated in
this experiment.


Test Materials 


The Michill General Ability Test (Quereshi, 1958, 
pp. 53-68) consists of letter-series items employed 
first by Thurstone (1938) and later by Furneaux 
(1955). This test has two equivalent forms, each 
composed of 120 items, standardized on a group of 
94 freshmen, sophomores, and juniors at the Uni- 
versity of Michigan.5 Each form is divided into six 
subtests, three with an average p value of .29 and 
three with that of .77. Table 1 presents the p values
and test-retest reliability estimates of the six equiva-
lent 20-item subtests at the high and six at the low
difficulty level. These subtests were administered to
the 41 Ss under different conditions specified later.


Payoff Conditions 


There were three payoff conditions. Under the first 
condition each S was told that he would win three 
cents for each correct solution and that there would 
be no penalty for making errors. The second condi- 
tion allowed a gain of three cents per correct solu- 
tion but also involved the penalty of three cents for 
each incorrect solution. The gain under the third condition
was the same as under the others but the penalty
for each error was nine cents. Under the first
condition, other things being equal, the Ss would be
expected to make more errors and would omit very
few items. The second condition would be expected
to suppress careless responses but not as much as
Condition 3, which would considerably boost the
number of omissions at the expense of errors as well
as correct solutions.

5 None of these 94 individuals participated in any
other part of this study.

6 A 14-page document containing the 12 twenty-item
subtests and Table A (residual correlations) has
been deposited with the American Documentation
Institute. Order Document No. 6221, remitting $1.75
for 35-mm microfilm or $2.50 for 6 X 8 photocopies.


Time Limits 


In order to study the influence of speeding combined
with other independent variables, two time
limits were included in the design. The Ss were either
allowed 10 minutes (short time limit) or 20 minutes
(long time limit) to work at a subtest of 20 items
and were timed by a stopwatch and an electric
buzzer.



Procedure

With a view to controlling the order effects due to
test materials, payoff conditions, time limits, and
difficulty levels, all these variables were randomly
combined. In addition, the 41 Ss were divided into
groups of equal size (except one with 11 members) and
were tested under all the 12 experimental conditions
(combinations of payoff conditions, difficulty
levels, and time limits). The sequence
varied for different subgroups. The
testing was carried out in two sessions, at
least 24 hr. apart, of 1.5 hr. each. The
arrangement in which the tests, payoff conditions,
difficulty levels, and time limits were combined
and administered to Ss is given in Table 2.
In order to make the payoff conditions well understood,
a chart was drawn on the blackboard indicating
the gain or loss per correct or incorrect
solution. The particular condition of payoff under
which Ss were to work during a certain period was
marked with an asterisk.


Instructions to the Ss 


In this test and all of the other tests in the series
you are going to take, each problem consists
of a row of letters arranged according to a certain
rule of its own. You are to discover the rule and
provide the last letter of the row. Write this letter
in block form in the space for answers.

On the first page you are given a few examples
that you might like to solve for practice. Take the
first example: D E F G H I J . . In this case,
the letters are in alphabetical order, so the answer
is K. Write this letter in the space for answers.
Try to solve the remaining examples and feel free
to ask any questions that you might have.

On the second page you will see the problems that
form the test itself. You will be required to solve
these problems under different rules which
will be mentioned before the start of each test. Be
careful to follow the rules and [...] accordingly.

The time allowed for each test will be announced
before the start of that test. A buzzer will sound
when the time is over. As soon as you [...]
immediately. Start when you are told.

Condition 1. Under this condition you will win
three cents for each correct solution, but you will
not lose any for wrong answers. You will have 10
or 20 minutes [whichever allowed in accordance
with Table 2].

Condition 2. Now you will again win three cents
for each correct answer, but if your answer is
wrong you will lose three cents each. You will have
10 or 20 minutes, etc.

Condition 3. Under this condition you will again
win three cents for each right answer but you will
lose nine cents for each wrong one. You will have
10 or 20 minutes, etc.


TABLE 2
[Combination of the independent variables for each group of Ss; table entries are not recoverable from the source.]
TABLE 3
MEANS AND STANDARD DEVIATIONS FOR VARIOUS EXPERIMENTAL CONDITIONS
BASED ON RAW SCORES

Column headings: Experimental Condition (Payoff Condition, Difficulty, Time); Rights (M, SD); Wrongs (M, SD); Omissions (M, SD).
Rows within each of the three payoff conditions: High difficulty, Short time; Low difficulty, Long time; High difficulty, Long time; Low difficulty, Short time.
[Table entries are not recoverable from the source.]



Treatment of Data 


Each of the three scores (rights, wrongs, and omissions)
under each of the 12 experimental conditions
was available for the 41 Ss. The data were first ana-
lyzed by analysis of variance and then by factor
analysis. Since most of the raw score distributions
were highly skewed, they were subjected to a cube-
root transformation that yielded fairly homogeneous
distributions.7 The analysis of variance was carried
out on the transformed distributions of all three 
types of scores so as to study the changes in per- 
formance not only as indexed by the rights but also 
by the wrongs and omissions. Only two of these three 
scores, however, are statistically independent. 
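
The transformation and the homogeneity check described above can be sketched as follows. This is a minimal illustration, not the author's computation: the function names are hypothetical, the formula is the standard Bartlett chi-square (which assumes independent groups, whereas here the same Ss appear under every condition), and the toy data in the usage note are invented.

```python
import math
from statistics import variance


def cube_root(scores):
    """Cube-root transform of non-negative raw scores."""
    return [x ** (1.0 / 3.0) for x in scores]


def bartlett_chi2(groups):
    """Return (chi-square, df) for Bartlett's test of homogeneity of variance.

    groups: list of score lists, one per experimental condition.
    Every group needs at least two scores and a non-zero variance.
    """
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]
    variances = [variance(g) for g in groups]  # unbiased sample variances
    total_df = sum(dfs)
    pooled = sum(d * v for d, v in zip(dfs, variances)) / total_df
    numerator = total_df * math.log(pooled) - sum(
        d * math.log(v) for d, v in zip(dfs, variances)
    )
    correction = 1 + (sum(1 / d for d in dfs) - 1 / total_df) / (3 * (k - 1))
    return numerator / correction, k - 1
```

With 12 conditions the statistic has 11 df, as in the footnoted Bartlett results. For example, `bartlett_chi2([cube_root(g) for g in raw_groups])` would test the transformed distributions, where `raw_groups` is a hypothetical list of 12 score lists.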

The use of factor analysis was necessitated by the 
need to describe and interpret the sources of varia- 
tion, especially the statistically significant interaction 
effects involving persons. The reasons for employing 
the square root method, which in this case is the 
most appropriate one, and the criteria for determin- 
ing the pivotal variables are presented later. 


RESULTS AND DISCUSSION 
Changes in Test Means and Correlations 


Table 3 presents means and standard devia- 
tions of the raw score distributions of rights, 
wrongs, and omissions under various experi- 
mental conditions. It is apparent that both 
means and standard deviations of all types 
of scores vary considerably across the experi-
mental conditions. For example, the means of
rights scores range from 3.5 to 16.8, and the
standard deviations range from 1.7 to 4.0.

7 The Bartlett test of the homogeneity of variance
yielded the following χ² estimates for 11 df: rights,
13.35; wrongs, 12.93; and omissions, 10.12; while P
for a χ² of 13.35 for the given df > .25.

The main cognitive ability involved in the 
solution of letter-series items is inductive rea- 
soning. To solve such problems, an individual 
has to study the arrangement of letters com- 
prising an item, induce the rule, and then 
write down the next letter missing in the 
chain. This trait has been commonly consid- 
ered to be general for most individuals over 
a variety of situations. The experimental treat- 
ments designed in this study are only a limited 
sample of different situations under which one 
may have to take a mental test or carry out 
any other cognitive activity. Table 4 contains 
median correlations of different scores under 
various experimental conditions.8 The median
correlation of rights with rights in this table 
indicates the degree of generality of the induc- 
tive reasoning ability. The medians ranging 
between .31 and .56, with the median of me- 
dians being .47 (P < .01), support the view 
that inductive reasoning is a general trait. 
However, the magnitude of correlation has 
been considerably attenuated by the experi- 
mental variation of conditions. 
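
One way to read the "median correlation" entries is sketched below: for each condition, correlate its scores with the scores under every other condition and take the median of those coefficients. The helper names and the data layout are assumptions made for illustration; the source does not give the computational procedure.

```python
import math
from statistics import mean, median


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def median_correlations(scores):
    """For each condition, the median of its correlations with all others.

    scores: list of per-condition score lists, with the Ss in the same
    order in every list (hypothetical layout).
    """
    result = []
    for i, this in enumerate(scores):
        others = [pearson(this, other) for j, other in enumerate(scores) if j != i]
        result.append(median(others))
    return result
```

The median of the values returned by `median_correlations` would then correspond to a "median of medians" for one type of score.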

Other information in Table 4 shows that
wrongs and omissions are associated with one
another only to a negligible degree while rights
and omissions have sizable intercorrelations.
The overall correlation between rights and
wrongs, and wrongs and omissions is approxi-
mately zero, while rights and omissions are
significantly associated with one another (the
median of medians being −.34). Since only
two of the three scores considered in this
study are statistically independent, this rela-
tively independent pair happens to consist of
rights and wrongs scores. This fact has to be
kept in view in interpreting the results of
analysis of variance presented later.

8 The complete matrices of correlations of differ-
ent scores under various experimental conditions are
presented elsewhere (Quereshi, 1958, pp. 31-33).

The median correlation of wrongs with
wrongs (Table 4) is approximately the same
as that of omissions with omissions but ap-
preciably lower than that of rights with rights.
This points to a tendency for wrongs and
omissions scores to be somewhat more sus-
ceptible to the influence of experimental varia-
tion than rights scores; this is further clari-
fied by the results of analysis of variance.


Significance of Main Effects and Interactions 


Table 5 gives the df, sources of variation, 
and F ratios for the respective variance based 
on rights, wrongs, and omissions. The main 
concern here is to compare and contrast the 
results based on rights and wrongs with much
less emphasis on the results based on omis-
sions. This is justified because rights and
wrongs, as pointed out previously, are rela-
tively independent of one another while rights
and omissions are not. 


TABLE 4
MEDIAN CORRELATION OF SCORES UNDER VARIOUS EXPERIMENTAL CONDITIONS
(N = 41)

Column headings: Payoff Condition 1, 2, and 3, each divided by Difficulty (High, Low) and Time (Short, Long); MM.
Rows (Scores Correlated):
Rights on this with rights on all others
Rights on this with wrongs on all others
Rights on this with omissions on all others
Wrongs on this with wrongs on all others
Wrongs on this with omissions on all others
Omissions on this with omissions on all others
[Table entries are not recoverable from the source.]

a P for an r of .30 = .05.
b MM stands for median of medians.
c Decimals omitted. All coefficients [...]
TABLE 5
ANALYSIS OF VARIANCE BASED ON RIGHTS, WRONGS, AND OMISSIONS

                      Mean Square    F            F            F
Source of Variation   (Omissions)    (Rights)     (Wrongs)     (Omissions)
Main Effects
  Ss (Subjects)         1.535          6.94***      8.00***      6.62***
  P (Payoff)            6.656          4.25*       36.27***     28.69***
  D (Difficulty)      177.480       1435.07***    142.91***    765.00***
  T (Time Limit)       50.830        139.99***    121.91***    219.09***

[The df column, the mean squares for rights and wrongs, and the rows for the first-, second-, and third-order interactions are not recoverable from the source.]

* P < .05
** P < .01
*** P < .001


All of the main effects in Table 5 are sig-
nificant. Of chief interest is the influence of
payoff (motivation) on performance. The F
ratio of 4.25 for rights is significant beyond
the .05 level though not at .01. On the other
hand, the influence of payoff on performance
as indexed by wrongs and omissions is highly
significant. The payoff conditions designed to
manipulate motivation significantly influence
the mental test performance of the individuals
as a whole.

Only two of the first-order interactions
based on rights are significant. The interac-
tion of payoff with difficulty is significant be-
yond .05 but not at .01, while the interaction
between difficulty and time limits is signifi-
cant beyond the .01 level. Since these two in-
teractions do not involve persons, their signifi-
cance is of little import in the context of this
study. The second-order interactions based on
rights, except one, are not significant at the
.05 level.


The variance estimates based on wrongs
contain some important information not ob-
tainable from those based on rights. Among
the main effects based on wrongs the influ-
ence of payoff on performance becomes more
significant than in the case of rights, while all
other sources of variation retain their high
level of significance. The proportion of vari-
ance attributable to difficulty, however, has
been reduced considerably.

An examination of the first-order interac-
tions points up a few other distinctive fea-
tures of the analysis of variance based on
wrongs in contrast to that based on rights.
The interactions between persons and diffi-
culty, and persons and time limits are sig-
nificant beyond the .01 level in the case of
wrongs, in contrast to the same interactions
based on rights. The interaction between per-
sons and payoff conditions is significant at the
.05 level when omissions are considered, indi-
cating that motivation, in addition to univer-
sally influencing the performance of a group,
has a differential effect on individuals, i.e., al-
though all individuals are affected by changes
in motivational conditions, some are more sus-
ceptible to such an influence than the others.
It is reasonable to expect that factors other 
than mental ability in the traditional sense 
are also responsible for individuals’ reactions 
to a testing situation. Studies on response sets 
(Cronbach, 1950; Ziller, 1957) have thrown 
some light on this aspect of individual differ- 
ences, but further investigations are needed to 
fully comprehend the genotypic dimensions of 
such behavior. Further discussion on this issue 
will follow in connection with the results of 
factor analysis. 

Among the second-order interactions based 
on wrongs, three are highly significant. The 


first and the most significant of these is the 
interaction among persons, payoff conditions, 
and difficulty of the test material. This means 
that in the case of wrongs, motivation in com- 
bination with the difficulty level produces a 
unique effect on individuals’ behavior. Similar 
conclusions can be drawn regarding the other 
interactions. 

The results based on omissions in Table 5 
provide some new information in terms of the 
significance at the .05 level of the first-order 
interaction between persons and payoff con-
ditions. Other results of the analysis of omis-
sions do not yield any new insight.

The results of analysis of variance provide
affirmative answers to the questions raised in
connection with the uniform as well as dif-
ferential response of individuals to various
experimental conditions. However, without an
adequate description and conceptual clarifi-
cation of these effects the matter still seems
to be shrouded in mystery. For example,
granted that the interaction between persons
and payoff conditions in terms of omissions
scores is significant, the question concerning
the cause of such interaction still remains un-
answered.


TABLE 6
[...] FACTOR ANALYSIS BASED ON 43 X 43 CORRELATION MATRIX

Each entry gives the symbol, its representation, its place in the analysis of variance, and its psychological meaning.

A. [Entry not recoverable from the source.]

B. Sum of omissions under Payoff 1 minus sum of omissions under Payoff 3. Persons X Payoff. [Psychological meaning not recoverable from the source.]

C. Sum of all wrongs under high difficulty minus sum of all wrongs under low difficulty. Persons X Difficulty. Differential response to difficulty.

D. Sum of all under long time minus sum of all under short time. Persons X Time. Differential response to speeding.

E. Sum of high difficulty under Payoff 1 and low difficulty under Payoff 3 minus sum of low difficulty under Payoff 1 and high difficulty under Payoff 3. Persons X Payoff X Difficulty. Differential response to joint changes in payoff and difficulty.

F. Sum of short time under Payoff 1 and long time under Payoff 3 minus sum of long time under Payoff 1 and short time under Payoff 3. Persons X Payoff X Time. Differential response to joint changes in payoff and speeding.

G. Sum of all high difficulty under short time and all low difficulty under long time minus sum of all low difficulty under short time and all high difficulty under long time. Persons X Difficulty X Time. Differential response to joint changes in difficulty and speeding.

a Variables C through [...]


TABLE 7
INTERCORRELATIONS OF THE SEVEN
PIVOTAL VARIABLES

[Matrix entries for Variables A through G are not recoverable from the source.]

Note. N = 41, and P for an r of ±.30 = .05.
a All coefficients are reported up to two decimal points.



Description and Interpretation of Effects In- 
volving Persons 


The method of factor analysis employed for 
describing and interpreting the statistically 
significant effects involving persons as re- 
ported here is different from routine factor- 
analytic devices since the problem, because 
of its nature, required special treatment. Ex- 
ploratory investigation showed that this could 
best be achieved by employing the square root 
method, using as hypothesized factors seven 
effects involving persons, which proved sig- 
nificant in analysis of variance. It was also 
decided that the first general factor should be 
based on rights, the second on omissions, and 
all the remaining five factors should be de- 
fined in terms of wrongs scores, since it was in 
the corresponding analyses that these effects 
appeared most significantly (see Table 5). 
Further, these seven postulated factors should 
be extracted from the grand correlation matrix 
of 43 variables in which the seven hypothe- 
sized factors were included as the seven piv- 
otal variables. The other 36 variables in this 
matrix would be 12 variables based on rights, 
12 based on wrongs, and the remaining 12 


based on omissions (Quereshi, 1958, pp. 31- 
33). Table 6 presents the seven pivotal vari- 
ables (hypothesized factors), the procedure 
by which they were defined, their place in the 
analysis of variance, and their psychological 
meaning. 
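
The extraction step can be sketched as below, assuming the usual form of the square root (diagonal) method: each factor is passed through one pivotal variable, the loadings are that variable's column of the current residual matrix divided by the square root of its diagonal entry, and the outer product of the loadings is then removed. The function name and the small matrix in the usage note are hypothetical; this is not the author's program.

```python
import math


def square_root_factors(corr, pivots):
    """Extract one factor per pivotal variable from a correlation matrix.

    corr:   square correlation matrix (list of lists), unities in the diagonal.
    pivots: row/column indices of the pivotal variables, in extraction order.
    Returns (loadings, residual); loadings[f][i] is the loading of variable i
    on factor f.
    """
    n = len(corr)
    residual = [row[:] for row in corr]  # work on a copy
    loadings = []
    for p in pivots:
        scale = math.sqrt(residual[p][p])  # residual variance of the pivot
        column = [residual[i][p] / scale for i in range(n)]
        loadings.append(column)
        for i in range(n):
            for j in range(n):
                residual[i][j] -= column[i] * column[j]
    return loadings, residual
```

For instance, with `corr = [[1, .6, .3], [.6, 1, .5], [.3, .5, 1]]` and `pivots = [0, 1]`, the first factor reproduces the first variable's correlations exactly, and the second is defined by what remains of the second variable. In the study the pivots would be the seven pivotal variables placed in the 43 X 43 matrix.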

In order to illustrate the procedure of de- 
fining the pivotal variables, let us take Vari- 
able B in Table 6. This variable represents the 
differences among persons due to their char- 
acteristic response to payoff manipulation 
while other conditions such as difficulty and 
time limit are held constant, and hence could 
be formed by subtracting the sum of scores 
under one payoff from the sum of scores un- 
der another. Since Payoff 2 is an intermediate 
case, the difference between scores under Pay- 
offs 1 and 3 should virtually give the same re- 
sults as those obtained by substituting 2 for 
3. Hence, Variable B is based on the sum of 
omissions under Payoff 1 minus the sum of 
omissions under Payoff 3. 
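
The definition of Variable B can be written out for a single S as follows. The dictionary layout (keys of payoff condition, difficulty, and time limit) and the function name are assumptions for illustration only.

```python
def pivotal_b(omissions):
    """Variable B for one S: omissions under Payoff 1 minus omissions under Payoff 3.

    omissions: dict mapping (payoff, difficulty, time) -> omissions score,
    e.g. (1, "high", "short") -> 4. Hypothetical layout.
    """
    def total(payoff):
        # Sum over the four difficulty-by-time combinations of one payoff.
        return sum(v for (p, _, _), v in omissions.items() if p == payoff)

    return total(1) - total(3)
```

Because the sum runs over all difficulty and time combinations within each payoff, those two variables are balanced out of the difference, which is what "held constant" amounts to here.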

Table 7 presents the intercorrelations of the 
seven pivotal variables labeled alphabetically. 
These coefficients indicate that except for 
Variable C, which is significantly associated 
with Variables A, E, and G (r being .42, .42,
and .35, respectively), all other variables are
relatively independent of each other. 

The results of the square root factor analy-
sis based on the 43 × 43 correlation matrix,
with unities in the diagonal, are given in
Table 8 which shows the factor loadings of
various experimental conditions on the seven
extracted factors.9 Table 8 also presents the
Kuder-Richardson (1937) Formula No. 21 
reliability estimates for the unspeeded condi- 
tions. These reliabilities may be taken into 
consideration in evaluating the results of fac- 
tor analysis presented here. 
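
For reference, Kuder-Richardson Formula 21 needs only the number of items, the mean, and the variance of total scores. The sketch below uses the standard form of the formula; the function name and the numbers in the usage note are illustrative and are not taken from Table 8.

```python
def kr21(n_items, mean_score, score_variance):
    """Kuder-Richardson Formula 21 reliability estimate.

    r = (k / (k - 1)) * (1 - M * (k - M) / (k * s^2)),
    where k is the number of items, M the mean total score,
    and s^2 the variance of total scores.
    """
    k = n_items
    return (k / (k - 1)) * (1 - mean_score * (k - mean_score) / (k * score_variance))
```

A 20-item subtest with a mean of 10 and a variance of 25 would give `kr21(20, 10, 25)`, about .84. The formula assumes items of equal difficulty and is not suited to speeded tests, which is consistent with its being reported only for the unspeeded conditions.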

The first factor in Table 8 virtually ex-
hausts the common variance among rights
scores as indicated by extremely low loadings
of rights on all other factors.10 Thus, the in-
terpretation of the rest of the factors is mainly
dictated by the exchange of loadings between
wrongs and omissions. The fact that the load-
ings on omissions are roughly reflections of
those on wrongs of Factors II through VII
has to be kept in view in interpreting these
six factors. The residual correlations reported
in Table A (see Footnote 6) are negligible
considering that they do not follow any par-
ticular pattern, and that the seven factors re-
ported here account for 70% of the total vari-
ance. The magnitude of the diagonal residuals
in some cases appears to be large, but this is
due to inserting unities in the diagonal of the
analyzed correlation matrix. The percentage
of variance attributable to each factor is as
follows:

Factor I 26

Factor II 10

Factor III 11

Factor IV

Factor V

Factor VI

Factor VII 6

9 Factors V, VI, and VII have been redefined and
interpreted since they were not properly defined in
the original study (Quereshi, 1958, pp. 40-41). Ac-
knowledgment is due to Lloyd G. Humphreys for
bringing this matter to the author’s attention.

10 In interpreting any of these factors only the
variables with a loading of ±.30 or above have been
considered.

TABLE 8
FACTOR LOADINGS OF VARIOUS VARIABLES AFTER SQUARE ROOT ANALYSIS

Column headings: Variables; Factors I through VII; Reliability, K-R (21).
Rows, listed separately for Rights, Wrongs, and Omissions: P1 DS, P1 EL, P1 DL, P1 ES, P2 DS, P2 EL, P2 DL, P2 ES, P3 DS, P3 EL, P3 DL, P3 ES.
[Table entries are not recoverable from the source.]

b P1, P2, or P3 refers to the payoff condition [remainder of note not recoverable from the source].


The first factor has very large positive load- 
ings on rights and large negative loadings on 
omissions. It accounts for 26% of the total 
variance, most of which represents the indi- 
viduals’ ability to solve letter-series problems 
accurately. This factor approximates induc- 
tive reasoning and is labeled as General Abil- 
ity. The magnitudes of loadings on rights sug- 
gest that, regardless of the payoff condition, 
this factor has largest loadings on those con- 
ditions which involve easy test material 
(mean p = .77). Within a given payoff, con- 
ditions involving high difficulty (mean p = 
.29) with either short or long time limit, both 
in the case of rights and omissions, have load- 
ings invariably lower than the condition which 
is equivalent on time but of low difficulty. 
This finding has considerable bearing on the 
current practice in group intelligence testing 
of allowing long time limit but setting the 
material at a high difficulty level. Whether 
the time is long or short, General Ability as
represented by performance on letter-series 
tests is measured more successfully by easy 
test material than by difficult test material. 
It is also indicated that when the test mate- 
rial is relatively easy little is gained by allow- 


ing generous time. A relatively short time 
limit will provide the same loadings on the 
General Ability factor as a long time limit. 

A highly intelligent and “perfectly logical” 
individual would omit no item under Payoff 
Condition 1. If an item were too difficult to 
solve within the given time, he would put 
down a guessed answer because there is no 
penalty for incorrect solutions. Under Payoff 
Conditions 2 and 3 he would reverse this 
strategy by minimizing guessing and omitting 
that item for which he was not fairly certain 
of the accuracy of his answer. If all Ss in this 
study had behaved like this hypothetical in- 
dividual, under Payoff Condition 1 Factor I 
would have had negative loadings on omis- 
sions and positive loadings on wrongs, while 
under the other two payoff conditions the 
loadings of wrongs would have been negative 
and of omissions, positive. The results show 
that omissions under Payoff Condition 1 do 
have negative loadings on Factor I; however, 
wrongs under the same payoff also have 
equally large negative loadings on this factor. 
Under Payoff Conditions 2 and 3 the ob- 
tained loadings are just the opposite of what 
was expected. Some interesting theoretical and 
practical questions arise here. 
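
The strategy attributed to the "perfectly logical" individual follows from a simple expected-gain calculation, sketched here. The three-cent gain and the penalties of zero, three, and nine cents come from the payoff conditions; the function names and the idea of a subjective probability of being correct are assumptions added for illustration.

```python
from fractions import Fraction

GAIN = 3                      # cents won per correct solution (all conditions)
PENALTY = {1: 0, 2: 3, 3: 9}  # cents lost per incorrect solution, by condition


def expected_gain(p_correct, condition):
    """Expected cents from answering an item, given the chance of being right."""
    return GAIN * p_correct - PENALTY[condition] * (1 - p_correct)


def break_even(condition):
    """Probability of being correct at which answering and omitting pay the same."""
    penalty = PENALTY[condition]
    return Fraction(penalty, GAIN + penalty)
```

Under these assumptions the break-even probabilities are 0, 1/2, and 3/4 for Conditions 1, 2, and 3: any guess is worth making under Condition 1, while under Condition 3 an answer pays only when the S is at least three-quarters sure of it.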

In group intelligence and achievement tests 
scoring formulas are commonly used to penal- 
ize guessing by subtracting number wrong (or 
a proportion thereof) from number right. 
These formulas are designed to weight wrongs 
negatively on the assumption that such a pro- 
cedure yields better measures of the ability 
concerned. The analysis here indicates that the 
psychological meaning of wrongs and omis-
sions under Payoff Condition 1 is not the same
as under Payoff Conditions 2 and 3. A scoring
formula of R − W or R − W/(n − 1) might
be justifiably used for scores under Payoff 
Condition 1, but it is inappropriate for ad- 
justing scores under Payoff Conditions 2 and 
3. The shift in the psychological meaning of 
wrongs and omissions suggests that appro- 
priate changes in scoring formulas are neces- 
sary in order to keep the psychological mean- 
ing constant. 
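
The scoring formulas mentioned above can be illustrated with a minimal sketch. It assumes that n is the number of answer alternatives per item, which is the conventional reading of R - W/(n - 1) but is not spelled out in this passage; the function names are hypothetical. The second function shows why the formula is said to penalize guessing: under it, a purely random guess has an expected gain of zero.

```python
def formula_score(rights: int, wrongs: int, n_options: int) -> float:
    """Rights minus wrongs/(n - 1), the conventional correction for guessing.

    n_options is assumed to be the number of answer alternatives per item.
    """
    if n_options < 2:
        raise ValueError("an item needs at least two answer alternatives")
    return rights - wrongs / (n_options - 1)


def expected_gain_from_blind_guess(n_options: int) -> float:
    """Expected change in the formula score from one purely random guess."""
    p_right = 1 / n_options
    # A right answer adds 1; a wrong answer subtracts 1/(n - 1).
    return p_right * 1 + (1 - p_right) * (-1 / (n_options - 1))


if __name__ == "__main__":
    print(formula_score(rights=30, wrongs=8, n_options=5))
    print(expected_gain_from_blind_guess(5))
```

With no penalty for wrongs, as under Payoff Condition 1, the same guess has a positive expected gain of 1/n, which is why guessing is the adaptive response there.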

Factor II in Table 8 has large positive load- 
ings on the first four conditions based on 
wrongs and moderate-sized negative loadings
on the remaining eight conditions also based
on wrongs. On the other hand, the first four
conditions based on omissions have sizable 
negative loadings and seven of the remaining 
eight conditions based on omissions have 
small positive loadings. Under Payoff Condi- 
tion 1 there is no penalty for wrong answers. 
It is apparent that under such circumstances 
an adaptive and intelligent response is to 
guess and not to omit whenever one is uncertain
or unable to produce the correct response.
The pattern of loadings described
above is indicative of such an adaptive response
to changes in risk. Hence, this factor
is named Adaptive Risk-Acceptance.

The loadings of variables based on wrongs
on Factor III indicate that there is a tendency
among some individuals to make careless
guesses (more errors) whenever the difficulty
is high, regardless of the time limit or
the payoff condition. At the same time the
number of omissions when the material is
highly difficult, regardless of time or payoff,
is considerably reduced (see the sizable negative
loadings of P1 D S, P1 D L, P2 D S,
P2 D L, P3 D S, and P3 D L based on omissions).
Such behavior under Payoff Conditions 2 and
3, when there is a penalty for wrong responses,
represents a sort of recklessness. Therefore,
this factor is called Brashness.

Factor IV has loadings on variables based
on wrongs considerably larger on those conditions
where the time limit is long, than on
those conditions where the time limit is short.
The foregoing statement applies regardless of
payoff conditions or difficulty levels. Factor
IV distinguishes Ss who have a particularly
strong tendency to give wrong answers especially
when they have ample time; i.e., it involves
either lack of patience or unwillingness
to acknowledge failure to find an acceptable
answer. They seem to be motivated to mark
answers rather than to discover solutions. It
seems justifiable to call this factor Marking
Compulsion.


An inspection of Factor V indicates that the 
combined effect of highly difficult material 
and relatively short time on the error behav- 
ior of some Ss is quite analogous to the effect 
of introducing penalty for errors when the test 
material is easy regardless of the time limit. 
Under the conditions of no penalty for errors
with highly difficult material and short time
limit, willingness to guess or making more 
errors is adaptive and intelligent behavior. 
But unrestricted guessing (making more er- 
rors), when there is penalty for wrong an- 
swers, the material is easy, and the time is 
long, is clearly unadaptive. It characterizes an 
adventurous attitude on the part of some Ss 
which is curbed only when the penalty for 
wrong answers is combined with both highly 
difficult material and short time limit. The 
effect of difficult test material combined with 
only one of the other two variables (payoff 
and time limit) is not appreciable in terms of 
loadings of wrongs on this factor. It seems 
that some Ss continue putting down answers 
adventurously unless they are made to re- 
alize the disadvantages of such an approach 
by making the test situation completely un- 
rewarding for guessing. Such behavior is com- 
monly categorized as Bluffing and hence is an 
appropriate name for Factor V. 

The loadings of wrongs and omissions on 
Factor VI indicate that some individuals tend 
to omit items and avoid guessing (large negative
loading of P1 D S based on wrongs)
when they should have been doing just otherwise.
In addition, a large number of errors under
P3 D S based on wrongs is indicative of
either total disregard for the conditions of 
reward and punishment as well as other cir- 
cumstances or utter confusion and perplexity. 
This is especially so when all conditions are 
unfavorable, i.e., the test material is highly 
difficult, the time is short, and the errors are 
penalized. It seems that some Ss under gen- 
erally unfavorable conditions of work tend to 
get panicky and become perplexed. Factor VI, 
therefore, represents Perplexity of this sort. 

The pattern of loadings of variables based 
on wrongs on Factor VII suggests that, regardless
of the payoff, combining highly difficult
material with a long time limit or easy
test material with a short time limit has the
same effect on the performance of some Ss:
the loadings of P1 D L and P1 E S, P2 D L
and P2 E S, and P3 D L and P3 E S taken
pairwise show a consistent trend. It means
that for some persons a decrease in time is
synonymous with an increase in difficulty;
they would not speed up their pace even when
the material is easy and there is a definite
possibility of getting more items correctly 
solved. The easygoing mode of response of 
some individuals leads to similar performance 
when either the difficulty is increased or time 
is decreased. However, when the time limit 
is long, as well as when the problems are easy, 
their performances evidence a sharp reduc- 
tion in errors. In the light of the foregoing, 
Factor VII is named Easygoingness. 

Factors II through VII will generally be 
classified as personality traits in the classical 
sense. However, a recent theoretical exposi- 
tion by Baldwin suggests that “the charac- 
teristics of a person which contribute to his 
carrying out adaptive behavior” (McClelland, 
Baldwin, Bronfenbrenner, & Strodtbeck, 1958, 
p. 210) can be considered an essential part of 
a fruitful definition of an ability construct. 
Conversely, characteristics that hamper a per- 
son’s adaptive behavior occupy an equally 
important position in any comprehensive con- 
ceptualization of ability. 

The variety of experimental circumstances 
employed in this study for investigating the 
changes in mental test performance provides
meaningful description, analysis, and inter- 
pretation of what happens to individual adap- 
tive behavior when the environmental condi- 
tions are manipulated in certain ways. What- 
ever reservations one may have about the 
sample of Ss or the use of factor analysis in 
this study, certain conclusions seem to be 
amply justified. Contrary to empirical find- 
ings, one might have been logically justified 
in expecting that the General Ability factor 
would account for a preponderant portion of 
the total variance. Although Factor I accounts 
for 26% of the variance, Factors II through 
VII contribute almost twice as much variance 
as Factor I. Thus, the experimental manipula- 
tion of payoff conditions, difficulty, and speed- 
ing can introduce substantially important new 
factors of which the test constructor may or 
may not be fully aware. Test constructors 
who are especially interested in developing 
relatively pure measures of such attributes as 
inductive reasoning or abstract reasoning have 
to either develop stricter controls for obtain- 
ing homogeneous measures of those abilities 
or determine the precise influence
of other factors and weight the score
accordingly. The commonly held belief that, 
for all practical purposes, test directions pro- 
vide an adequate control of the Ss’ motivation 
and/or mental set is clearly untenable. 

The results of factor analysis have, in some 
respects, supplemented and clarified those of 
analysis of variance. Using them as supple- 
mentary devices, whenever suitable, seems to 
have some potential advantage over the ap- 
plication of either of them alone. 


SUMMARY 


This study was designed to investigate the 
influence of various payoff conditions both 
separately and in conjunction with variation 
in item difficulty and degree of speeding on 
mental test performance as measured by a 
test of letter-series items. The Ss were 41
college freshmen, sophomores, and juniors. The
data were analyzed both by analysis of variance
and factor analysis taking into consideration
the number of (a) rights, (b) wrongs,
and (c) omissions. The application of factor 
analysis was necessitated by the need to de- 
scribe and interpret the sources of variation 
found to be significant by analysis of variance. 

The results point out (a) the universal and
differential effect of motivation (as defined
and manipulated by various rules of payoff)
on performance, (b) the differential susceptibility
of individuals to test direction, (c)
the optimal conditions for measuring General 
Ability as defined by performance on letter- 
series tests, (d) the shift in the meaning of 
wrongs and omissions due to variation in test- 
ing conditions, and (e) the relation between 
test performance and such personality charac- 
teristics as Adaptive Risk-Acceptance, Brash- 
ness, Marking Compulsion, Bluffing, Perplex- 
ity, and Easygoingness. The implications of 
these findings for the theory and practice of 
ability measurement have been considered in 
their proper perspective. 
TO INTRAOCCUPATIONAL PROFICIENCY

THE PROBLEM


A possible relationship between vocational
interests and occupational proficiency has been
indicated (Anastasi & Foley, 1949, p. 663;
Carter, 1944, p. 9; Darley, 1941, p. 68;
Fryer, 1931, pp. 217, 221, passim; Strong,
1943, p. 506; Super, 1949, p. 435). Related
studies have been summarized elsewhere
(Stone, 1953, Ch. 2). Failure of previous
studies has been attributed to either an insufficient
number of cases and/or an inadequate
vocational-success criterion. With few
exceptions, such studies have been concerned
with more than one occupation and, therefore,
interoccupational scales. Intraoccupational
interest scales have appeared under a
variety of dichotomous labels: success-failure,
successful-unsuccessful, superior-inferior,
good-poor, and others.

Here, the objective is to differentiate subgroups,
superior and inferior, of a given occupation.
The problem is: to construct an intraoccupational
interest scale and to cross-validate it.

METHOD 
The Vocational-Success Criterion 
1 writers constitute the occupational group 
orthand skill 
objec tive criterion 


per se affords the use 
The ability to re- 
b standards and to read 
is the test. The crite 


which rel 


oken discourse at jc 

rion is limited to those 
r back.” Transcription 

is not required. The pr 

vork sample of occupationa 


The overall effectiveness 


training 


actual 


Pure shorthand skill characterizes and
identifies the occupational group. The basic question,
then, is: how do the vocational interests of the
proficient subgroup of shorthand writers differ from
those of the nonproficient subgroup?

It is recognized that vocational interest keys have
been standardized for stenographic and secretarial
workers on the basis of the Strong Vocational Interest
Blank. For purposes of this study, however,
use of such keys ...

The Sample 


The subjects were rteenth-ye female stu 


nts in attendanc: t of California’s publi 


junior The lents were enrolled in an ad 
the time of testing 


shorthand skill re 


vanced shorthand course, and at 
they d d the minimum 
( uired 

The subjects were grouped as follow 
The Cross-Validation Group (N = 100 


papers of th 


When the 
total group of Ss were arranged in 
ficiency criterion scores, ever 1ith 

regard to tied scores an 
I 


moved 

nen armarked for the cross 
group. This group was used in the valida 
process to test the 
A random san 


validation group wz 


mean differ 
Ss from this cross 
prediction phase of 


significance of 
pling ol 
used in the 
the validation process 
The Standardizatior When the 
removed from the 


Group (N = 1000) 
had been 
otal group, the standardization group remained. The 
standardi s the 


group and the 


ross-validation grout 
ation group wé source for the criterion 
primary group, as explained below 


Intercorrelations of the iterion, interest, and intelli 


gence variables were | on the data of the stand 
used in the construction 


ilso based 


ardization group. The 
linear regressio1 n 


lance oO! 


ide up of the tv subgroups: the st 
The grou 


<¢ 


lower 25 


ind the 


inferior 50 Ss 
upper 25% : the 
the criterion re distribution 


rion g1 


10th 


gard to tied scores and 


ion scores, every 


up. The pri- 
n process for 
TABLE 1

DISTRIBUTION OF RAW SCORES ON THE PROFICIENCY CRITERION
FOR THE STANDARDIZATION GROUP BY SUPERIOR, INFERIOR,
AND MIDDLE GROUPS

Raw-score intervals: 40-, 35-39, 30-34, 25-29, 20-24,
15-19, 10-14, 5-9, 0-4. Summary rows: N, M, SD.

* Reliabilities are compu 


The Tests 


The test battery consisted of three tests, selected
especially to measure the vocational-success, interest,
and intelligence variables. To insure uniformity of
administration, a special manual was prepared for
participating shorthand instructors. With few exceptions,
the test battery was group administered.

The vocational criterion test was considered the
best available instrument in keeping with the objectives
of the study (Turse & Durost, 1941). The work
sample is prepared in such a manner as to supply an
adequate evaluation of shorthand proficiency without
the influence of irrelevant factors. The test consists
of five “typical business letters.” In the interest
of testing time only two of the five letters were used
in connection with the criterion measure. Preliminary
testing showed a high correlation of Letters I, II, and
V with Letter III. Letter IV showed a low correlation
with Letters I, II, III, and V. Accordingly, Letters
III and IV were selected for use in the study.
Also, these two letters best represent the major areas
into which common shorthand errors fall; namely,
language skill, penmanship, and mastery of shorthand
principles. Letters III and IV yield a maximum
score of 73. One point was deducted for each of the
following types of errors: grammar, spelling, punctuation,
and context.

Electronic-process answer sheets were used with
the Interest Blank (Strong, 1946). When the criterion
groups had been determined on the basis of the criterion
performance, the inventory was scored separately
for the superior and inferior subgroups. The
total number of L, I, and D responses was obtained,
by means of an electronic computer, for each of the
400 items. Later, the total algebraic score was obtained,
also by means of electronic computation, for
each S. The vocational interest key developed on the
basis of response differentiation was employed. The
total algebraic score for each subject was transmuted
to a nonalgebraic score.

Table 1, displaced entries: Groups, Middle, Inferior;
250; 20.46 ± .37; 58 ± .2

The intelligence test selected served the design of
the study by furnishing a general index of intellectual
ability (Otis, 1922).


THE RESULTS 
The Criterion Group 


The criterion group is comprised of the 
upper and lower 25 percents of the criterion 
performance scores for the standardization 
group. Table 1 shows the distribution of raw 
scores on the proficiency criterion for this 
group. 

The two subtests used in the criterion instrument
(Letters III and IV) give a Pearsonian
product-moment correlation of .96, a
split-half reliability coefficient, for the standardization
group. By application of the Spearman-Brown
formula, the coefficient is raised
to .98 ± .003.
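
The step from .96 to .98 can be checked with a short sketch of the Spearman-Brown formula. The function name and the `k` parameter are illustrative choices, not part of the study; `k = 2` corresponds to stepping a half-test correlation up to full length.

```python
def spearman_brown(r_half: float, k: float = 2.0) -> float:
    """Predicted reliability when a test is lengthened k times.

    r_half is the correlation between the two parts; k = 2 is the
    usual split-half case.
    """
    return k * r_half / (1 + (k - 1) * r_half)


if __name__ == "__main__":
    # Correlation between Letters III and IV as reported in the text.
    print(round(spearman_brown(0.96), 2))
```

For the reported half-test correlation of .96 this gives 1.92/1.96, which rounds to the .98 stated above.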
Construction of the Superior-Inferior Interest Scale

The Technique. The superior—inferior inter- 
est scale was constructed on the basis of the 
Ream-Cowdery-Strong scoring technique, us- 
ing Strong’s +4 percentage-difference chart 
for Kelley’s weighted phi coefficient formula 
(Strong, Ch. 23). The revised formula for 
semiequalized fourfold tables is 


W = 100 a 
“ 1 — 47 A? 


1— 3pq 

p*q’ 

The weights thus obtained are multiplied by 
100 for the purpose of dealing with only 
whole numbers. 

The superior group, or the upper 25% of 
the criterion distribution, was used as the 
positive reference point. 

The Superior—Inferior Interest Scale. A to- 
tal of 190 of the 400 items included in the 
Strong Vocational Interest Blank, or 47.5% 
of the inventory, differentiated the two groups. 

Reliability of the Superior—Inferior Inter- 
est Scale. The odd-even reliability coefficient 
was computed, correlating scores of one set 
of 95 items with scores of the second set of 
95 items. The Pearsonian product-moment 
correlation is .94. When corrected by use of 
the Spearman-Brown formula, however, the 
correlation is raised to .96 ± .01.

The superior—inferior interest scale was 
used for scoring the interest inventories for 
the cross-validation group and the standardization
group (N = 1100). The distribution
was found to be practically normal. The 
means of the superior and inferior subgroups 
of the criterion group were tested for signifi- 
cant differences. When the percentage of over- 
lapping was computed by Tilton’s (1937) 
formula, the superior and inferior subgroups 
were found to overlap 39% with a standard 
error of 3.25. 


a-c 


A=-— and 


2 _~ 


Validation of the Superior—Inferior Interest 
Scale 


Criterion and Predictor Critical Scores. In 
the validation process the superior Ss are con- 


sidered to be those who score above a pre- 
determined critical score on the criterion 
measure, while the inferior Ss are those who 
score below such score. With respect to the 
proficiency criterion, at least three possibili- 
ties for determining the critical score were 
available: (a) the mean of the proficiency 
criterion distribution, 34.11; (b) the weighted
mean of the means for the two groups, 34.16; 
and (c) Guilford’s (1950) formula for ob- 
taining the critical value, 34.64. Although any 
one of the three scores would have sufficed, 
the weighted mean was employed. While this 
value was 34.16 for the criterion proficiency 
variable, it was 26.84 for the predictor (in- 
terest) variable. 

Matched Groups. It has previously been 
pointed out that the interest scale would be 
validated by use of the primary group (N =
100), part of the standardization group,
and the cross-validation group (N = 100).
The primary group figured in the construc- 
tion of the interest scale, while the cross- 
validation group now enters the study for the 
first time. 

A correlation of +.09 was found for inter- 
ests (the variable on which the difference is 
to be tested) and intelligence (a related vari- 
able). A statistical control was accomplished 
by applying Wilks’ σdM formula (1931) for
matched samples, after matching the superior 
and inferior categories of the primary and 
control groups for selected statistics, based on 
the intelligence factor. Age and sex were con- 
trolled by the design of the study. 

The original constants and distribution sta- 
tistics on the intelligence factor do not differ 
greatly for the superior and the inferior cate- 
gories. In order to obtain more valid results 
when applying the test of significance to the 
interest means, however, the matching pro- 
cedure was carried out. Two cases were lost 
from the primary group and four cases were 
lost from the cross-validation group in the 
process of insuring comparable statistics. 

Tests of Significance. Table 2 shows the re- 
sults from the significance of difference tests 
for means and standard deviations, for the 
primary and cross-validation groups. 

Prediction. As an additional means of validating
the superior-inferior interest scale, regression
equations were derived. The vocational-success
criterion, occupational proficiency,
is the dependent variable; interest is
the independent variable.

TABLE 2

SIGNIFICANCE OF DIFFERENCE TESTS FOR MEANS AND SDs ON THE
S-I INTEREST SCALE FOR THE SUPERIOR AND INFERIOR CATEGORIES
OF THE PRIMARY AND CROSS-VALIDATION GROUPS

Legible entries: Primary, 1.14, 10.85, 93;
Cross-Validation, 1.18; Inferior, 5, .93.

In order to test the goodness of fit of the 
regression equations, 30 Ss from the cross-validation
group were selected by the random-number
technique. This group has not
figured in the data from which the regression
equations are constructed, since the regres- 
sion equations are based on the performance 
of the standardization group. Table 3 shows 
the product-moment intercorrelations for the 
criterion, interest, and intelligence variables 
for the standardization group. 

First-order partial correlations are shown in 
Table 4. 

TABLE 3

PRODUCT-MOMENT INTERCORRELATIONS FOR THE CRITERION,
INTEREST, AND INTELLIGENCE VARIABLES FOR THE
STANDARDIZATION GROUP

Variable        Intelligence      Interests
Intelligence    x                 .090 ± .031
Interests       .090 ± .031       x
Criterion       .212 ± .030*      (not legible)

* Significant beyond the

For the prediction of the criterion (Y) from
interests (X), the usual formula for the linear-regression
equation was employed, giving

Y' = .440 X + 22.29

For the prediction of interests (X) from
the criterion (Y), the linear-regression equation
was computed to be

X' = .845 Y - 1.96
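
The two regression equations can be applied and cross-checked with a minimal sketch. It assumes the slopes are .440 and .845, with the decimal points restored from the garbled print, and the constant and function names are hypothetical. The check rests on a standard property of simple linear regression: the product of the two slopes equals the squared correlation between the variables.

```python
from math import sqrt

# Coefficients as read from the two equations above (an assumption
# insofar as the decimal points are not legible in the source).
B_YX, A_YX = 0.440, 22.29   # criterion (Y) predicted from interests (X)
B_XY, A_XY = 0.845, -1.96   # interests (X) predicted from criterion (Y)


def predict_criterion(interest_score: float) -> float:
    """Predicted proficiency score for a given interest-scale score."""
    return B_YX * interest_score + A_YX


def predict_interest(criterion_score: float) -> float:
    """Predicted interest-scale score for a given proficiency score."""
    return B_XY * criterion_score + A_XY


def implied_correlation() -> float:
    """Correlation implied by the two slopes: sqrt(b_yx * b_xy)."""
    return sqrt(B_YX * B_XY)


if __name__ == "__main__":
    print(round(predict_criterion(26.84), 2))
    print(round(predict_interest(34.16), 2))
    print(round(implied_correlation(), 2))
```

Entering the interest critical score of 26.84 returns a predicted criterion score close to the criterion critical score of 34.16, and the slopes imply a correlation of about .61 between interests and the criterion.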


DISCUSSION 


Although the study deals with an occupational
group, the subjects selected are students.
The nature of the proficiency criterion,
the work sample, is, however, the determining
factor. Since the criterion has been
defined as skill performance, the employment 
status of the Ss did not interfere with their 
being classified on the basis of that defined 
criterion. Notwithstanding, a performance dif- 
ference is acknowledged when students are 
compared with gainful employees. That dif- 
ference, though, does not alter the nature of 
the problem. 

Previous studies which have dealt with this
problem have been criticized chiefly on the
basis of the vocational-success criterion selected
(Craig, 1925; Freyd, 1926; Ream,
1924; Ryan & Johnson, 1942; Strong, 1943,
pp. 508-509). Accordingly, this writer has
tried to avoid such criticism.

TABLE 4

FIRST-ORDER PARTIAL INTERCORRELATIONS FOR THE CRITERION,
INTEREST, AND INTELLIGENCE VARIABLES FOR THE
STANDARDIZATION GROUP

Variable        Intelligence      Interests
Intelligence    x                 -.04 ± .032
Interests       -.04 ± .032       x
Criterion       .204 ± .030*      .606 ± .020*

* Significant beyond the .001 level.

To the extent that theory and practice 
meet, to that extent will such an effort as put 
forth here prove practical. It is hoped that 
this study will help close the gap between 
theory and practice in the psychological area 
of vocational-interest measurement. 


SUMMARY 


A relationship was sought between voca- 
tional interests and occupational proficiency. 
The vocational-success criterion was defined 
as pure shorthand skill. The subjects were 
1100 female students enrolled in public 
junior colleges in California. The test battery 
consisted of an interest inventory, a short- 
hand proficiency test, and an intelligence test. 
A superior—inferior interest scale was con- 
structed on the basis of the Ream-Cowdery- 
Strong scoring technique, based on differentia- 
tion of responses. The data were subjected to 
statistical tests of significance and the linear- 
regression predictive technique for cross-vali- 
dation purposes. 

Interpretation of the data warrants the fol- 
lowing conclusions: 

1. An intraoccupational interest scale can 
be constructed for subgroups of a given occu- 
pation by use of a standardized interest inven- 
tory, on the basis of statistically significant 
differentiation in group response. 

2. Members of an occupational group can 
be classified on the basis of interests, with re- 
spect to quality of occupational performance. 

3. The intraoccupational interest scale can
be validated by the use of statistical tech- 
niques, using primary and cross-validation 
groups. 
LEGIBILITY OF MATHEMATICAL TABLES

MILES A. TINKER

University of Minnesota


The legibility of mathematical and statistical
tables has been neglected as far as research
is concerned. The great variation in
typographical arrangements found in published
tables in the field suggests that such
factors as convenience, amount of space
taken, and specific use for which the table
was designed have led to arbitrary choice of
the printing arrangement employed. In many
tables, economy of space seems to be the sole
consideration in determining the typographical
arrangement with little or no attention to
legibility as such. In a previous study Tinker
(1954) compared the legibility of five published
tables. Since these tables were used
just as published, the experimental design left
much to be desired because of the number of
variables involved. It seems desirable to investigate
the legibility of mathematical tables
using an experimental design which permits
control of the variables operating. In tables
for general use, squares, cubes, square roots,
and cube roots seem to be much used. This
study is confined to tabular material of this
sort. The purpose is to study the effect of
type size, the arrangement of numerals in columns,
and space vs. space plus rules between
columns on the speed with which the correct
numbers can be located. Accuracy of finding
numbers was tabulated. Since errors were
few and scattered no use was made of the accuracy
score.

MATERIALS AND PROCEDURE

d pages
sheet of 70-lb. sulphide
The four simulated
blocks from left t right
n pages The
of powers and roots were
on oF paper with
surlace pages
ex.
with 1} in
headings of columns
90 000

This research was supported by a grant from the
Graduate School, University of Minnesota.

There were 50 numerals in each column;
the four pages listed 200 numerals giving powers and
roots for the numbers 300 to 499. Numerals in the
No. column were in boldface and the powers and
roots in ordinary printing. The numerals were in a
modern type face (Excelsior), i.e., all digits were
the same height. The same tables were printed on
both sides of the sheet. There were nine studies
completed with 24 to 30 university students as Ss in
each study. The total N was 246.

All testing was done individually in the Minnesota
Licht Laboratory under 25 ft.-c. of well diffused
illumination. The particular typographical arrangement
employed in each substudy is indicated with
the table of results for that study.

On arrival at the laboratory, S was asked to examine
the four tables on the printed sheet, to note
the No. column and the columns for the squares,
cubes, square roots, and cube roots, and that there
were 50 numerals in each column, e.g., 300-349, etc.
He was then told his task was to find and cross out
the squares (or cubes, or square roots, or cube roots)
as quickly and accurately as possible when a number
was presented to him. Two practice numerals
were presented on one side of a 3" x 5" index card.
Then the card was turned over and S was timed
while he looked up what was required for eight
numerals. When working on each, he was asked to
note where the columns for it (squares, etc.) were
located. The stimulus numerals were systematically
scattered over the four pages of tables. A different
set of numerals was used for cubes, etc. Orders of
presentation as to squares, cubes, square roots, and
cube roots were systematically rotated to equate
practice effects.


RESULTS 


Study 1. The results of Study 1 are presented
in Table 1. Either 6- or 8-point type
is ordinarily used in mathematical tables.
These type sizes were compared in this first
study. Numerals in columns were grouped by
fives and there was one pica (1/6 inch) of
space between columns. As shown in the
table, the differences in time were always in
favor of the 8-point type, but the differences
were not statistically significant as shown by
the t test. The t test for related measures was
used. In the previous study by Tinker the
comparison of 6- and 8-point was also equivocal,
perhaps because of complication by other
variables.


TABLE 1

STUDY 1: SIZE OF TYPE

Comparison    6-point Mean   8-point Mean   Difference*   t
              in Sec.        in Sec.
Squares       28.27          27.87          +0.40         0.4568
Cubes         29.57          29.03          +0.54         0.7831
Sq. Roots     27.83          27.33          +0.50         0.8609
Cube Roots    29.47          29.00          +0.47         0.6853

Note. Numerals were grouped by fives in columns with one
pica space between columns. Form I was printed in 6-point
type with 6-point leading between successive groups in a
column; Form II was printed in 8-point type with 8-point
leading between successive groups in a column. N = 30;
df = 29.

* In all tables a plus difference means that the second score
is smaller and therefore better. Original computations were
carried to four decimal places.
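
The t test for related measures used throughout these studies can be sketched as follows. The per-subject times are not reported in the article, so the data in the usage lines are hypothetical and serve only to show the computation; the function name is likewise an illustrative choice.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first: list[float], second: list[float]) -> tuple[float, int]:
    """t statistic and degrees of freedom for related (paired) measures.

    Each S contributes one time under each form; the test is carried
    out on the per-subject differences (first minus second).
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long samples of at least 2 scores")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # Raises ZeroDivisionError if every difference is identical.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical times in seconds for four Ss (not data from the study).
    six_point = [30.0, 28.0, 31.0, 27.0]
    eight_point = [29.0, 28.5, 29.0, 27.5]
    t, df = paired_t(six_point, eight_point)
    print(round(t, 4), df)
```

A positive t here corresponds to the plus differences in the tables, i.e., a smaller and therefore better second score.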


Study 2. The results are given in Table 2.
The study was concerned with grouping of
numerals in columns for 6-point type. Set
solid (no grouping), grouping by tens, and
grouping by fives were compared. Both
grouping by tens and by fives were more
effective than set solid, and grouping by fives
tended to be better than by tens. This was
especially so for finding squares and cube
roots which were in the columns at the left
and at the right of the table. Note that in
every instance grouping by fives was significantly
better than set solid, while fives were
significantly better than tens only for squares
and cube roots. Apparently the advantage of
grouping by fives over tens depends somewhat
on the position of the columns in the
table. It would appear that grouping by fives
in columns is the best arrangement when
6-point type is used.

Study 3. The results are shown in Table 3. 
This study was concerned with grouping of 
numerals in columns for 8-point type. Effi- 
ciency of finding powers and roots was com- 
pared for set solid, grouping by tens and by 
fives. The results show that for squares and 
cube roots, grouping by fives was significantly 
better than set solid or grouping by tens. In 
the interior columns of the table (cubes and 
square roots) only the five grouping was sig- 
nificantly better than set solid for square 
roots. The rest of these differences were not 
significant. In general, some method of 
grouping for 8-point print tends to be su- 
perior to set solid, and in one instance, group- 
ing by fives is better than by tens. The slight 
superiority of the five grouping over the ten 
grouping tends to be the same as for 6-point 
type. 


TABLE 2

STUDY 2: GROUPING OF NUMERALS IN COLUMNS

Columns: Test; Comparison; Forms; Mean Scores in Seconds
(First, Second); Diff.; t.

Rows: Squares, Cubes, Sq. Roots, and Cube Roots, each compared
as Solid vs. Tens (I vs. II), Solid vs. Fives (I vs. III), and
Tens vs. Fives (II vs. III).

Legible mean scores in seconds:
Squares       I 28.21   II 28.50   III 26.58
Cubes         I 29.63   II 29.33   III 28.71
Sq. Roots     I 31.92   II 30.38
Cube Roots    I 30.92   II 30.00

Legible significance markers: Squares, Solid vs. Fives ** and
Tens vs. Fives **; Cubes, Solid vs. Fives ***; Sq. Roots,
Solid vs. Fives **; Cube Roots, Solid vs. Fives * and
Tens vs. Fives *.

Note. Materials were printed in 6-point type with one pica
space between columns. Form I had numerals set solid down
columns, Form II had numerals grouped in tens, and Form III
had numerals grouped in fives. There was a 6-pica space between
successive groups. N = 24; df = 23.

* Significant at the .01 level.
** Significant at the .02 level.
*** Significant at the .05 level.
TABLE 3

STUDY 3: GROUPING OF NUMERALS IN COLUMNS

                                              Mean Scores in Seconds
Test         Comparison        Forms          First    Second    t
Squares      Solid vs. Tens    I vs. II       29.25    26.46     2.8988*
Squares      Solid vs. Fives   I vs. III      29.25    27.08     2.6854**
Squares      Tens vs. Fives    II vs. III     26.46    27.08     0.8599
Cubes        Solid vs. Tens    I vs. II       29.83    28.50     1.5836
Cubes        Solid vs. Fives   I vs. III      29.83    29.42     0.5307
Cubes        Tens vs. Fives    II vs. III     28.50    29.42     0.8623
Sq. Roots    Solid vs. Tens    I vs. II       30.92    29.17     1.6121
Sq. Roots    Solid vs. Fives   I vs. III      30.92    28.83     2.1427***
Sq. Roots    Tens vs. Fives    II vs. III     29.17    28.83     0.3939
Cube Roots   Solid vs. Tens    I vs. II       29.75    31.04     1.3245
Cube Roots   Solid vs. Fives   I vs. III      29.75    27.46     2.8989*
Cube Roots   Tens vs. Fives    II vs. III     31.04    27.46     4.3660*

Note. Materials were printed in 8-point type with one pica space between
columns. Form I had numerals set solid down columns, Form II had numerals
grouped in tens, and Form III had numerals grouped in fives. There was an
8-pica space between successive groups. N = 24. df = 23.
* Significant at the .01 level.
** Significant at the .02 level.
*** Significant at the .05 level.

Study 4. The data are presented in Table 4.
Space vs. space plus rules between columns
was compared for 6-point type set solid. No
significant differences were found.

TABLE 4

STUDY 4: SPACE VS. RULES BETWEEN COLUMNS

Columns: Comparison (Squares, Cubes, Sq. Roots, Cube Roots); Space Mean
in Sec; Rule Mean in Sec; Difference; t.
The row alignment of the values is not recoverable from the source.
Legible means: 30.00, 32.93, 32.89, 32.21, 33.07 (Cube Roots).
Legible differences: -0.57, 0.65, +1.22, +0.86.
Legible t values: 0.6306, 0.8014, 1.9253.

Note. Materials were printed in 6-point type. Form I had one pica space
between columns, Form II had one pica space plus a rule between columns.
Numerals were set solid in columns, i.e., no grouping or extra leading.
N = [illegible].

Study 5. The results appear in Table 5.
Space vs. space plus rules between columns
was compared for 6-point type with numerals
grouped by tens in the columns. The results
are somewhat in conflict in that space appears
best for cubes and rules (actually one pica
space plus a rule) best for square roots. The
other differences were not significant.

Study 6. The results appear in Table 6.
Space vs. space plus rules between columns
for 6-point type with numerals grouped by
fives in columns was compared. The space
alone was significantly better for finding
square roots and cube roots. It made no
difference with squares.

Study 7. The results are given in Table 7. 
Space vs. space plus rules between columns 
for 8-point type set solid was compared. No 
significant differences appear. 


TABLE 5

STUDY 5: SPACE VS. RULES BETWEEN COLUMNS

              Space Mean    Rule Mean
Comparison    in Sec        in Sec        t
Squares       30.29         29.89         0.4309
Cubes         31.96         33.39         2.0795**
Sq. Roots     33.86         30.68         4.6979*
Cube Roots    31.00         30.68         0.4156

Note. Materials were printed in 6-point type. Form I had one pica space
between columns, Form II had one pica space plus a rule between columns.
Numerals were grouped by tens in columns with a 6-point space between
successive groups. N = 28. df = 27.
* Significant beyond the .01 level.
** Significant between the .02 and .05 levels.


TABLE 6

STUDY 6: SPACE VS. RULES BETWEEN COLUMNS

              Space Mean     Rule Mean
Comparison    in Sec         in Sec         Difference    t
Squares       29.96          29.50          0.46          0.4958
Cubes         30.71          32.57          1.86          2.1825***
Sq. Roots     33.86          30.46          +3.40         3.9048*
Cube Roots    [illegible]    [illegible]    +1.89         2.6196**

** Significant between [remainder of the table note is illegible in the source]

Study 8. The results appear in Table 8.
Space vs. space plus rules between columns
for 8-point type with numerals grouped by
tens was compared. Rules between columns
were significantly better for squares and
square roots. The other differences were not
significant.

Study 9. The results are given in Table 9.
Space vs. space plus rules for 8-point type
grouped by five was compared. Rules between
columns led to significantly faster responses
in finding square roots and cube roots. The
other differences were not significant.

TABLE 7

STUDY 7: SPACE VS. RULES BETWEEN COLUMNS

[Table body is not recoverable from the source; the only legible t value is 0.3363.]


TABLE 8

STUDY 8: SPACE VS. RULES BETWEEN COLUMNS

              Space Mean    Rule Mean
Comparison    in Sec        in Sec         Difference     t
Squares       29.61         [illegible]    [illegible]    2.2024*
Cubes         31.14         [illegible]    ?.57           0.9693
Sq. Roots     33.71         [illegible]    +1.53          2.4022*
Cube Roots    30.46         [illegible]    0.03           0.0413


DISCUSSION

The size of type in mathematical tables
may be either 6-point or 8-point without
adversely affecting the speed with which powers
and roots are found, provided there is a
grouping of numerals by fives down the
columns. Although powers and roots are found
slightly faster (1.4 to 1.8%) when tables are
printed in 8-point type, the differences might
well have occurred by chance.

Apparently the most favorable factor in de- 
termining quick finding of powers and roots 
in tables is the grouping of numerals in the 
columns. For both 6- and 8-point type, 
grouping numerals in columns by fives or 
tens tends on the average to produce quicker 
finding of powers and roots. Furthermore, 
the grouping by fives appears better than by 
tens for both the 6- and the 8-point type. 

TABLE 9

STUDY 9: SPACE VS. RULES BETWEEN COLUMNS

              Space Mean     Rule Mean
Comparison    in Sec         in Sec         Difference     t
Squares       29.89          29.50          0.39           0.3398
Cubes         31.89          31.00          +0.89          1.0951
Sq. Roots     [illegible]    [illegible]    [illegible]    3.9856*
Cube Roots    [illegible]    [illegible]    [illegible]    3.4224*


The results are not unequivocal with regard
to the use of space or space plus a rule
between the columns of numerals. When the
numerals are set solid in columns (no group- 
ing), it makes no difference whether there is 
one pica space or one pica space plus a rule 
between columns. This is true for both 6- 
and 8-point type. When the numerals were 
grouped, either by fives or tens, the picture 
was somewhat different. Of the 16 differences 
seven were significantly in favor of one pica 
space plus a rule between columns, two were 
significantly in favor of one pica space be- 
tween columns, and for the other seven it 
made no difference (differences not signifi- 
cant). It would appear, therefore, that there 
is a slight advantage for the space plus a 
rule. In general it would seem that either 
one pica space or one pica space plus a rule 
between columns is satisfactory for mathe- 
matical tables of powers and roots in either 
6- or 8-point type. 


SUMMARY 
1. The purpose of this experiment was to 
investigate the effect of type size, arrange- 
ment of numerals in columns, and space vs. 
space plus a rule between columns on the 
speed of locating powers and roots in mathematical
tables. Five-column tables were printed so that
these variables could be studied, one at a time.

2. All testing was done individually under
25 ft-c of well diffused illumination. In the
nine studies completed there were 24 to 30 Ss
in each, a total of 246.

3. When numerals are grouped by fives in
columns, 6- and 8-point type were equally
effective in promoting location of powers and
roots.

4. Grouping numerals in columns by fives
or tens tends to promote quick finding of
powers and roots. In general, the grouping
by fives tends to be more effective than
grouping by tens for both 6- and 8-point type.

5. Apparently it makes little difference
whether one pica space or one pica space plus
a rule between columns is used in tables of
powers and roots. In this comparison the
results were not entirely unequivocal.

FOREMEN-WORKER ATTITUDE PATTERNS 1


RICHARD J. OBROCHTA 


Convair/San Diego, A Division of General Dynamics 


The role of the supervisor in affecting 
worker attitudes and behavior has been the 
subject of investigation in a number of re- 
search studies. Uhrbrock (1934), in his well 
known study, found that foremen generally 
had more favorable attitudes toward the com- 
pany than did worker groups. Independent 
studies by Super (1939) and Fairchild (1940)
showed a positive relationship between occu- 
pational skill or level and attitude: the higher 
the skill, the more favorable the attitude to- 
ward the company. Lawshe and Nagle (1953) 
reported a correlation of .86 between the 
worker’s perception of his supervisor’s behav- 
ior and production. 

The present article reports on a study 
of foremen-worker interpersonal relationships 
and the attitudes of the worker toward the 
company, job, union, and union leaders. It
asks two questions: 1. If a foreman and 
worker have favorable attitudes toward each 
other, will they tend to “share” attitudes to- 
ward the company, job, union, and union 
leaders? 2. If a foreman and worker have un- 
favorable attitudes toward each other, will 
they tend “not to share” attitudes toward the 
company, job, union, and union leaders? 


1 This study is based on a thesis submitted in partial fulfillment for
the degree of Master of Science to the Graduate School of Loyola
University, Chicago. It was done as a special study of the Human
Relations in the Meat Packing Industry Research Project and was financed
by the Rockefeller Foundation, Swift and Company, Amalgamated Meat
Cutters and Butchers Workmen, AFL-CIO, with the cooperation of the
National Brotherhood of Packing House Workers (Independent) and the
United Packing House Workers of America, AFL-CIO. All research was under
the direction of T. V. Purcell, Loyola University.

2 Not Convair at San Diego


PROCEDURE


Purcell interviewed 121 hourly workers and 19 foremen to obtain their
attitudes toward each other, the company, job, union, and union
leadership. Snedecor’s Table of Random Numbers (1939) was used to select
the sample. Three variables were applied: sex, race, and length of
service (Short: 2-7 years; Middle: 8-15 years; and Long: 16 years and
over). All workers and foremen were employees of Swift and Company and
most of the hourly workers were members of the National Brotherhood of
Packing House Workers, Local 12. While a nondirective approach was used
in the interview, the interview content was directed to cover 25 topics
pertaining to company, job, union, and union leadership. Each interview
was recorded and later typed into manuscript form. The typed manuscripts
are the source of data for this article.

Only 65 of Purcell’s original sample of 121 hourly 
workers were used in this study. It was necessary to 
use foreman-worker “combinations” (where both the 
foreman’s and the worker’s attitudes toward each 
other, the company, job, union, and union leader- 
ship were obtained) since reciprocal attitudes were 
involved. Not all of the original data were usable
for several reasons: 1. Foremen could not express 
their attitudes on employees who were recently trans- 
ferred into their group because they (foremen) felt 
they did not know these workers well enough. 2. In 
some cases, the worker had expressed his attitude to- 
ward an assistant foreman rather than the foremen 
used in the sample. 3. Because of 1 and 2, additional 
cases had to be discarded to maintain the stratification
required to resemble the plant population
distribution.

A rating scale of 1.0 to 5.0 was devised to measure 
the favorability or unfavorability of the foremen and 
worker attitudes. 1.0 = Very favorable, 2.0 = Favor- 
able, 3.0 = Neutral or indifferent, 4.0 = Unfavorable, 
5.0 = Very unfavorable. One-half steps, 1.5, 2.5, 3.5, 
and 4.5, were also used in rating the attitudes to ob- 
tain a finer measurement. 

A degree of difference table was used to measure 
the amount of “sharing” or “not sharing” of atti- 
tudes between foremen and hourly workers toward 
each other, the company, job, union, and union lead- 
ership. For example, if a foreman’s attitude toward 
the company was rated 2.0 (favorable) and the 
worker’s attitude toward the company was also rated 
2.0, the degree of difference was considered zero.
Each of the 65 foreman-worker combinations was 
worked out in this manner. The favorability or un- 
favorability of the foremen and the workers as a 
group was also obtained by totaling the attitude rat- 
ings, dividing by N and converting this into a per- 
centage for the group. 
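
The degree-of-difference idea can be sketched in a few lines. This is an illustration under stated assumptions, not the author's worksheet: the ratings below are hypothetical, the function names are invented for the example, and counting a pair as "in agreement" when the two ratings lie within 1.0 scale point follows the ±1.0 definition the article uses for agreement.

```python
def degree_of_difference(foreman_rating, worker_rating):
    """Absolute gap between two ratings on the 1.0-5.0 favorability scale."""
    return abs(foreman_rating - worker_rating)


def percent_agreement(pairs, tolerance=1.0):
    """Percentage of (foreman, worker) pairs whose ratings differ by <= tolerance."""
    agree = sum(1 for f, w in pairs if degree_of_difference(f, w) <= tolerance)
    return 100.0 * agree / len(pairs)


# Hypothetical foreman-worker combinations (half steps are allowed on the scale).
pairs = [(2.0, 2.0), (1.5, 3.0), (2.0, 2.5), (4.0, 2.0)]

differences = [degree_of_difference(f, w) for f, w in pairs]
agreement = percent_agreement(pairs)
print(differences, agreement)
```

With these made-up ratings, two of the four combinations fall within the tolerance, so half of the pairs would count as "sharing" the attitude.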

The standard error 3 of the difference of the percentages between two
sample percentages was used to determine whether or not foremen and
workers with favorable interpersonal attitudes shared attitudes toward
the company, job, union, and union leadership. Critical ratios were then
obtained to determine the significance of the findings. The second
question asked in this research, namely, “will foremen and workers with
unfavorable interpersonal attitudes tend not to share attitudes toward
the company, job, union, and union leaders,” could not be answered
because only four cases were found where this negative interpersonal
relationship existed.

The author’s ratings were compared with those made by another member of
the research staff to check their reliability. “Agreement” was defined
as ±1.0, based on the 1.0-5.0 scale.

3 $\sigma_{D_p} = \sqrt{pq\left(\frac{1}{N_1} + \frac{1}{N_2}\right)}$, where p equals the
percentage of foremen and workers with favorable interpersonal attitudes
who “share” attitudes toward the company, job, union, and union
leadership; q equals 1 − p, and N equals the number of foremen-workers
in each group. N_1 equals the foremen-worker group with mutually
favorable interpersonal attitudes who also “share.” N_2 equals the
remaining foremen-worker group.


RESULTS


Reliability of Attitude Ratings


The author’s and research associate’s ratings of the hourly workers’
attitudes toward the company agreed (±1.0) in 63 of the 65 ratings or
96%. Worker attitudes toward the job, foremen, and union agreed (±1.0)
in 61 of the 65 hourly workers rated or 93.8%. The lowest agreement
occurred in the ratings of the hourly workers’ attitudes toward the
union leadership: 57 out of 65 ratings or 87.7%. (See Table 1.)


TABLE 1
PERCENTAGE OF AGREEMENT BETWEEN AUTHOR’S AND
RESEARCH MEMBER’S ATTITUDE RATINGS

                       Percentage of Agreement (±1.0)
                       Hourly Workers    Foremen
Attitude Toward        (N = 65)          (N = 19)
Company                96.0              100.0
Job                    93.8              89.4
Foremen                93.8
Workers                                  90.0
Union                  93.8              100.0
Union leadership       87.7              100.0


The reliability of the attitude ratings of
the foremen attitudes was checked in the
same manner. These results are also shown in
Table 1. The author’s and research associate’s
ratings of the foremen’s attitudes toward the 
company, union, and union leadership agreed 
100% (±1.0). The ratings of the foremen’s
attitudes toward the hourly workers agreed 
in 90% of the cases while the ratings of the 
attitudes toward the job agreed 89.4%. 


Agreement of Foremen-Worker Attitudes 


Table 2 shows the extent that foremen and 
workers agree in their attitudes toward the 
company, job, each other, the union, and un- 
ion leadership. Foremen and hourly workers 
agreed most in their attitudes toward the 
company (86.9%) and least in their attitudes 
toward the union leadership (54.6%). 

Table 3 shows the results of comparing the 
attitude ratings of the foremen and workers 
as a group. The foremen group tended to be 
slightly “more” favorable toward the com- 
pany than the worker group was. Neither 
group showed any differences in favorability 
toward their jobs. 

Although the hourly worker group is gen- 
erally favorable toward the union leaders, the 
foremen group is even more favorable (33.8% 
more) toward the leaders. This is further sub- 
stantiated by the weighted mean rating of 2.4 
(favorable-indifferent) for the hourly work- 
ers’ attitudes toward the union leadership 
compared to 1.6 (favorable-very favorable) 
for the foremen’s attitude toward the union 
leadership. 

Table 3 also shows that the hourly worker
group was more favorable (38.4% more) toward
the union than the foremen group was.
The hourly workers were also more favorable
(32.3% more) in their attitudes toward their
foremen than the foremen were toward them.


TABLE 2
PERCENTAGE OF AGREEMENT BETWEEN FOREMEN
AND WORKER ATTITUDES

                       Percentage of Agreement (±1.0)
Attitude Toward        of Foremen-Worker Attitudes
Company                86.9
Job                    69.2
Each other             67.0
Union                  78.4
Union leadership       54.6


TABLE 3
PERCENTAGE OF “MORE” FAVORABLE ATTITUDES OF
FOREMEN GROUP AND HOURLY WORKER GROUP

                       More Favorable    Percentage of “More”
Attitude Toward        Group             Favorable Attitudes
Company                Foremen           10.7
Job                                      0.0
Each other             Workers           32.3
Union                  Workers           38.4
Union leadership       Foremen           33.8


Relationship of Foremen-Worker Attitudes 


Table 4 shows the standard error of the 
difference of the percentage for the worker- 
foremen group who had favorable interper- 
sonal attitudes and shared in their attitudes 
toward the company, job, union, and union 
leadership; and the foremen-worker group 
who comprised the remainder of the sample. 


Critical ratios for these two groups are also 
shown. No significant difference was found in 


the “sharing” of attitudes by the foremen- 
worker group who had favorable interper- 
sonal attitudes when compared to the re- 
maining foremen-worker group. 
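
A minimal sketch of the standard error of a difference between two percentages and the resulting critical ratio is given below. The group sizes 40 and 25 are those stated in the title of Table 4; the two "sharing" proportions are hypothetical, and using the pooled proportion for p is an assumption made for this illustration (the article does not show its arithmetic).

```python
import math


def se_difference(p, n1, n2):
    """Standard error of the difference between two sample proportions."""
    q = 1.0 - p
    return math.sqrt(p * q * (1.0 / n1 + 1.0 / n2))


def critical_ratio(p1, p2, n1, n2):
    """Difference between proportions divided by its standard error."""
    # Assumption: p is the pooled proportion across both groups.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    return (p1 - p2) / se_difference(pooled, n1, n2)


n_favorable, n_remaining = 40, 25  # group sizes from the Table 4 title
p_favorable = 0.80                 # hypothetical proportion "sharing"
p_remaining = 0.72                 # hypothetical proportion "sharing"

cr = critical_ratio(p_favorable, p_remaining, n_favorable, n_remaining)
print(f"critical ratio = {cr:.3f}")
```

A critical ratio below about 1.96 in absolute value would not reach the .05 level, which is the kind of outcome the article reports for all four attitude objects.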


DISCUSSION 


As pointed out before, the hourly-workers
were favorable toward their local union
leaders; however, the foremen as a group tended
to be more favorable toward the union
leaders than the hourly-worker group was. The
minority of hourly-workers who expressed
less favorable attitudes toward the union
leadership seemed to have four basic reasons
for these negative attitudes. The workers felt:
(a) The union leaders had been in office too
long. (b) The leaders did not do enough for
all the workers. (c) The leaders “got their
orders” from the company. (d) Some of the
leaders were racially prejudiced.

Conversely, the favorable attitudes toward 
the union leadership which were expressed by 
the foremen seemed to center around three 
basic reasons: (a) The union leaders did lead
the workers and made the foremen’s jobs
easier by ironing out problems. (b) The un- 
ion leaders were broadminded and handled 
cases with good decisions. (c) The leaders 
watched over their group of workers and took 
good care of them. 

The hourly workers who expressed favor- 
able attitudes toward the union justified this 
with two reasons: (a) The union gave job 
security to the workers; and (b) the union
served as the workers’ voice to management. 
It should be clarified here that, for this study, 
the term “union” meant a union more than it 
pertained to any particular local union. 

The findings on the foremen-worker inter- 
personal attitudes indicated that the worker 
was more favorable toward his foreman than 
the foreman was toward him. This is not too 
easily understood. On the surface, however, 
it seems two factors contribute to the work- 
ers’ more favorable attitudes toward the fore- 
men: (a) The criteria used by the foremen 
to judge favorableness or unfavorableness dif- 
fered from the criteria used by the workers. 
The foremen used specific tangible factors, 
such as quantity and quality of work, punc- 
tuality, skill, and job knowledge whereas the 
workers’ criteria consisted of such general 
factors as fairness, friendliness, and under- 
standing. (b) The higher authority status of 
the foremen tended to make them more criti- 
cal of their workers. 

No correlation was obtained between fore- 
men-worker interpersonal attitudes and their 
attitudes toward the company, job, union, 
and union leadership. The foremen and workers
in the sample, although they may think
favorably of each other, still “think for
themselves” on those factors relating to the
company, job, union, and union leaders.


TABLE 4

COMPARISON OF “SHARING” OF ATTITUDES OF FOREMEN-WORKER
GROUP WHO HAD FAVORABLE INTERPERSONAL ATTITUDES (N = 40)
WITH FOREMEN-WORKER GROUP REMAINING IN SAMPLE (N = 25)

Attitude Toward: Company; Job; Union; Union leadership.
[The numerical columns of this table are not legible in the source.]

* Not significant at .05 level of confidence.


SUMMARY 


The research posed two questions: 1. If a 
foreman and a worker have favorable attitudes 
toward each other, will they tend to “share” 
attitudes toward the company, their jobs, the 
union, and union leadership? 2. If a foreman 
and a worker have unfavorable attitudes to- 
ward each other, will they tend ‘“‘not to share” 
attitudes toward the company, their jobs, the 
union, and union leadership? 

Sixty-five foreman-worker combinations were selected for study. Their
interviews were recorded and then typed. The typed interviews were read,
analyzed, and rated for attitude content on a scale of 1.0 to 5.0: 1.0
being at the “very favorable” end of the scale and 5.0 being “very
unfavorable.”

1. Foremen and workers agreed most in their attitudes toward the
company (86.9%).

2. They agreed least in their attitudes toward the union leadership
(54.6%).

3. Hourly-workers tended to be more favorable in their attitudes toward
the union than the foremen were.

4. Hourly-workers tended to be more favorable toward their foremen than
the foremen were toward them.

5. Foremen tended to be more favorable toward the company than the
workers were.

6. Foremen tended to be more favorable toward the union leaders than
the hourly-workers were.

No significant difference was found to indicate that if a foreman and a
worker like each other they will tend to share attitudes toward the
company, job, union, and union leaders. The second research question
remained unanswered because of the few cases found where unfavorable
foreman-worker interpersonal attitudes existed.
A NOTE ON THE RELATIONSHIP BETWEEN AGE
AND SALES EFFECTIVENESS


WAYNE K. KIRCHNER, CAROLYN S. McELWAIN, AND MARVIN D. DUNNETTE


Minnesota Mining and Manufacturing Company 


Good salesmen are present at all age lev- 
els. The “topnotcher” may be young, old, 
or middle-aged and examples of each are com- 
mon. Selling, of course, is not unique in this 
for the same is true of most professions. Age 
alone is not and should not be a determinant 
of whether or not a man can do a job. Stud- 
ies suggest, however, that peak performance 
in many professions occurs at certain ages 
more often than at others. For example, 
physicists tend to make their major contribu- 
tions at fairly young ages while astronomers 
usually do so at a much later age. The age 
of peak performance can vary from profes- 
sion to profession. 

With respect to selling, therefore, these
questions arise. Is there any relationship
between sales effectiveness and chronological
age for salesmen? Is there any particular
age period related to peak effectiveness in
sales work? If so, when should most salesmen
expect to reach their peak? Answers to
these questions, of course, have some practical
utility in selection, placement, training, and
sales manpower development. The purpose
of this note is to present findings from a
recent study comparing sales effectiveness with
chronological age for a large group of salesmen
in an attempt to answer the above questions.


METHOD 


This study was an offshoot of a study of concur- 
rent test validity. In the major study, a large group 
of salesmen in a nationally known manufacturing 
concern, Minnesota Mining and Manufacturing Com- 
pany, volunteered to take a battery of tests that 
were used as an aid in screening sales applicants. 
Test results for this group were compared with judg- 
ments of sales effectiveness obtained from the sales- 
men’s various managers. These judgments of effec- 
tiveness were also compared with the ages of the 
salesmen who were rated. Each manager’s rankings 
of his men were converted into a stanine distribu- 
tion so that a numerical index of effectiveness (9, 
8, 7 and so on) was provided for each man in the 
study. The subjects for study comprised 539 sales- 
men from two older sales divisions, which meant 
that the comparison extended over a broad age 
range. The mean age of the sample group was 
34.07 years; the youngest salesman was 23 years old 
and the oldest salesman was 65 years old. 
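
One way to turn a manager's rankings into stanines is sketched below. It assumes the conventional stanine percentages (4, 7, 12, 17, 20, 17, 12, 7, 4) and a mid-rank percentile position; the note does not say exactly how its conversion was carried out, and the function name `stanine_from_rank` is hypothetical.

```python
from bisect import bisect_right

# Cumulative upper bounds of stanines 1-8 under the conventional
# 4-7-12-17-20-17-12-7-4 percent split (stanine 9 takes the rest).
CUTOFFS = [0.04, 0.11, 0.23, 0.40, 0.60, 0.77, 0.89, 0.96]


def stanine_from_rank(rank, n):
    """Map a rank (1 = most effective of n men) to a stanine from 1 to 9."""
    if not 1 <= rank <= n:
        raise ValueError("rank must lie between 1 and n")
    # Mid-rank position counted from the bottom of the distribution.
    position = (n - rank + 0.5) / n
    return bisect_right(CUTOFFS, position) + 1


# A hypothetical manager ranking 25 salesmen.
best = stanine_from_rank(1, 25)
middle = stanine_from_rank(13, 25)
worst = stanine_from_rank(25, 25)
print(best, middle, worst)
```

For a group of 100 ranked men, this assignment reproduces the 4, 7, 12, 17, 20, 17, 12, 7, 4 split exactly; smaller groups only approximate it.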


RESULTS 


Primary results are shown in Tables 1 and 
2. As is readily seen in Table 1, ratings of 
sales effectiveness, as shown in mean stanine 
scores, increase with age up to age 40 and 
then start to decrease after that age. It is 
obvious that many older salesmen receive 
high ratings. Generally speaking, however, 
the trend is clear for this sample. Sales 
effectiveness seems highest around age 40 
with the age range of roughly 30-45 years 
appearing to be the “golden range” of selling. 


TABLE 1

RELATIONSHIP BETWEEN CHRONOLOGICAL AGE AND MEAN SALES EFFECTIVENESS
SCORES FOR 539 SALESMEN a

Rows: Mean Effectiveness Score; SD; N. Columns: Age Group (the last
group is "50 and up").
[The cell values of this table are not recoverable from the source.]

a Via stanine conversion of rankings.



This probably follows common expectations. 
Younger salesmen starting their careers are 
rated lower in terms of effectiveness, but as 
they become experienced gain the status of 
successful salesmen. Ultimately a plateau is 
reached and then a “downhill trend” starts. 

The statistical significance of these data is 
shown in Table 2. As the F ratio shows, the 
probability of these mean differences occur- 
ring through chance is low (less than one 
chance in 1000). This is strong evidence 
that the trend shown here is a statistically 
reliable one. At the same time, however, a 
relatively low value of 0.26 was obtained for 
the correlation ratio (η) for these data. This
shows that over 90% of the sales effective- 
ness variance remains unassociated with age. 
It goes without saying, therefore, that infer- 
ences about sales effectiveness based on 
knowledge of age alone would be unwise and 
could be substantially “off base.” 
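
The correlation ratio can be checked directly from the sums of squares printed in Table 2, since η² is the between-groups sum of squares divided by the total. The sketch below uses only those two printed values; the function name is an illustrative choice.

```python
import math


def correlation_ratio(ss_between, ss_within):
    """Return (eta, eta squared) from an analysis-of-variance breakdown."""
    ss_total = ss_between + ss_within
    eta_squared = ss_between / ss_total
    return math.sqrt(eta_squared), eta_squared


# Sums of squares as printed in Table 2.
eta, eta_squared = correlation_ratio(129.39, 1729.59)
unassociated = 1.0 - eta_squared  # share of variance not associated with age

print(f"eta = {eta:.2f}, variance unassociated with age = {unassociated:.1%}")
```

The result agrees with the reported value of 0.26 and with the statement that over 90% of the variance in rated effectiveness is unassociated with age.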


TABLE 2

ANALYSIS OF VARIANCE RESULTS FOR RELATIONSHIP
BETWEEN CHRONOLOGICAL AGE AND MEAN
EFFECTIVENESS SCORES

Source      SS
Between     129.39
Within      1729.59
Total       1858.98

* P < .001.


DISCUSSION AND CONCLUSIONS


Little discussion is needed. These results
merely indicate a definite trend in sales
effectiveness within one firm. Possible reasons
for the decline are many. Lack of motivation
after many years on the job could be a
strong factor. Actual physical deterioration
with “slowing up” is another strong possible 
reason. Promotion of the better persons to 
more responsible positions is a third possi- 
bility. There is no shortage of speculative 
reasons. Over-all, though, the major conclu- 
sion that can be reached is that sales effec- 
tiveness is related, on the average, to age, 
with a peak in effectiveness occurring around 
age 40. Older persons (over 45) may be less 
effective as a group in sales work than their 
younger co-workers. It cannot be overem- 
phasized, however, that great individual dif- 
ferences exist and that many older salesmen 
are ranked extremely high in sales effective- 
ness. 
(Received May 29, 1959) 

In a very brief unpublished paper, Gulliksen, Saunders, and Tucker
(1954) directed attention to results strongly suggesting the
possibility of increasing many of our better multiple correlations with
academic grades by .10 to .15. This was to be achieved through use of
paper-and-pencil measures of consistency in making pair-comparison
judgments. In two quite different groups of college-level students,
with quite different content, format, and administration conditions for
the pair-comparison schedules, they noted that, though the relationship
was curvilinear, a linear correlation between consistency and average
grades would have been in the .30 to .35 range. The curve was such that
the best grades were made by students who scored at about the 75th or
80th percentile of consistency. Above and below this point the
prognosis was progressively poorer. The possibility of increasing
multiple correlations with academic grades stemmed from the fact that
this high relationship between consistency and grades was accompanied
by “substantially zero” relationships between consistency and ability
as measured by the American Council on Education Psychological
Examination (ACE) or the West Point Admissions Test. The present study
seeks similar indications on samples of law school students.

Consistency as a personality trait might, of course, be very important
in high-level intellectual activities such as the study of law. An
utter inconsistency of opinion and attitude would make organized study
exceedingly difficult, not to mention the possible ramifications such a
characteristic might have were it strongly present in an interpreter of
law such as a Supreme Court Justice. On the other hand, too great an
emphasis on consistency could close one’s mind to new possibilities,
ideas, or conditions, thus producing a sterility which is to be avoided
both in the classroom and in professional practice. The measurement of
consistency as a personality trait by means of observing one’s
consistency in performing a task is, of course, an example of a truly
objective personality test in the sense in which Campbell (1957),
Cattell (1957), Hills and Messick (1959), and others use the term.

1 This study is part of a larger study supported by the Law School
Admission Test Policy Committee. The cooperation of the law schools
which are making this study possible is greatly appreciated. We are
grateful, also, for the advice and assistance of Ledyard Tucker and
Henrietta Gallagher. During the study the authors were on the staff of
the Research Division of Educational Testing Service.


METHOD 


A person makes consistent pair comparisons to the extent that he avoids
judgments of the following nature: A is preferred to B, B is preferred
to C, but C is preferred to A. Kendall (1955, p. 148) has shown how easy
it is to ascertain the number of such circular triads in a set of pair
comparisons. In his formula for the number of circular triads, d,

$$d = \frac{1}{12}\,n(n-1)(2n-1) - \frac{1}{2}\sum_{i} a_i^{2}$$

where n is the number of elements involved in the comparison pairs, and
a_i is the number of times Element i is, say, preferred to each of the
other elements. In our case, there were 21 elements being compared with
each other. Therefore (1/12) n(n − 1)(2n − 1) is a constant, 1435, for
each S and need not be used at all. Instead we merely sum the a_i²
values for each person and interpret this as a measure of his
consistency. The higher his score the fewer circular triads he uses and
the more consistent are his responses.

Our instrument for obtaining pair comparisons was the Legal Traits Test
(LTT). This is the senior author’s modification of the Military Traits
Test used in the West Point study (Saunders, 1954) cited by Gulliksen,
Saunders, and Tucker. It is composed of 210 items, each of which asks S
to decide whether, if his personality had to change, he would rather
have more of one personality trait and less of a second or more of the
second and less of the first. The following 21 “personality traits”
were presented in all possible pairs: polished, popular, sociable,
trustful, assertive, cooperative, loyal, helpful, imaginative,
intelligent, persevering, competitive, thoughtful, strict, alert,
conventional, foresighted, sense of humor, athletic ability, memory,
and personality. The pairs were placed in an optimum order (Ross,
1934); Ss were given 50 minutes to complete the schedule.


TABLE 1

INTERCORRELATIONS, MEANS, [remainder of title illegible in the source]

Legible row labels: School; Grades; Law School Admission Test; Mean; σ; N.
[The cell values of this table are not recoverable from the source.]

* Significant at .05 level.
** Significant at .01 level.


SUBJECTS 


The Ss were first-year students during the 
fall of 1956 at eight law schools. In each 
school Ss who took the Legal Traits Test were 
a random selection from the entire entering 
class; the remaining Ss took other tests dur- 
ing this period. Six of the institutions were 
private universities; the other two were state- 
supported universities. 


CRITERION 


In our study the criterion was average first- 
year grade in law school. The reliability of 
these grades has not yet been directly evalu- 
ated. However, since the Law School Admis- 
sion Test (LSAT) generally had sizable cor- 
relations with them, in most cases the grades 
would seem to be at least moderately reliable. 
The correlations between first-year average 


and second-year average grade are generally 
quite high. 


RESULTS 


The intercorrelations between consistency
scores, LSAT scores, and first-year grades,
and the means and standard deviations of
consistency scores at the eight schools appear
in Table 1. It is clear from these data that
among these law school students pair-comparison
consistency does not behave exactly
as it did at West Point or Temple University,
the sources of the Gulliksen-Saunders-Tucker
data. Unlike the Gulliksen-Saunders-Tucker
data, Table 1 shows no indications of consistency
scores correlating significantly with
grades. In six of the eight schools the observed
relationship between consistency scores
and grades is lower than that between consistency
and LSAT scores. These data do not
at all support for law schools the suggestion
that consistency scores may be useful for increasing
any of our multiple correlations with
grades. Further, when scatterplots were examined,
there were no indications of curvilinearity
of regression of grades on consistency.

[Table 1, surviving entries: 18, 19, 2649, 160]

One might suggest that the lack of correla- 
tions between our consistency scores and 
grades could be attributed to low reliability 
of our consistency scores. The Gulliksen- 
Saunders-Tucker report states that individu- 
als differ widely and reliably in their ability 
or tendency to make pair-comparison judg- 
ments that are free from circular triads. 
These authors concluded that the consistency 
score must be reliable because it did correlate 
with grades.* With our data we cannot use 
that basis for claiming reliability. However, 
our Legal Traits Test is only a minor modification
(lengthening) of the Military Traits
Test, the measure which provided the
Gulliksen-Saunders-Tucker consistency scores at
West Point. Further, Davis (1957, p. 23)
found a test-retest reliability, over a six-week
interval, of .43 for a pair-comparison schedule
composed of the 36 combinations of nine elements.
He concludes that the number of circular
triads Ss make is reliable from test to retest.
Thus, there is support for the belief that
the variance of consistency scores is not just
error.

If all of the law students were very con- 
sistent compared to the West Point and Tem- 
ple students, the consistency scores in our 
data would all come from the upper part of 
the distribution of consistency scores. This 
part of the distribution was essentially unre- 
lated (in terms of a straight line) to achieve- 
ment in the Gulliksen-Saunders-Tucker data. 
Our scores are negatively skewed, i.e., there is 
a preponderance of high consistency scores. 
The scores obtained by Gulliksen, Saunders, 
and Tucker were similarly negatively skewed.’ 
Thus, this does not seem to explain the dis- 
crepancy in our findings. 

One final possible explanation is that there 
is something different about our grades which 
accounts for the failure to find correlations 
between consistency and grades among law 
school students. Further study of military 
academy and college undergraduates may 
clarify this possibility. At any rate, it seems 
clear that consistency scores from the LTT 


are unlikely to be very useful as predictors
of first-year law school average grades.

2 D. R. Saunders. Personal communication, 1958.


SUMMARY


In order to ascertain whether consistency in
making pair-comparison judgments would be
useful as a predictor of first-year law school
grades, consistency scores from the Legal
Traits Test were correlated with grades at
eight schools of law. None of the linear
correlations was significantly different from zero,
and no tendencies toward curved relationships 
were evident in the scatterplots. Since these 
results are so different from the results re- 
ported by Gulliksen, Saunders, and Tucker, 
it is suggested that there are unknown but 
important differences between law school 
grades and college grades, or between law 
students and Temple or West Point students, 
which strongly influence the relationship be- 
tween consistency scores and grades. At any 
rate, these consistency scores are not promis- 
ing as predictors of law school grades.

The process of learning to receive Morse
code has been of interest to psychologists since
the pioneering work of Bryan and Harter
(1897; 1899). For various reasons, interest
has shifted from the form of the learning
curve in code reception to the identification
of those individuals who have greatest aptitude
for rapid learning of this skill. The present
study is another in a series (Fleishman,
1955; Fleishman & Spratte, 1954; Fleishman
& Friedman, 1957a, 1957b; Fleishman, Roberts,
& Friedman, 1958; Highland & Fleishman,
1958) concerned with (a) the specification
of the ability variables involved in this
complex perceptual skill and (b) the improvement
of assessment techniques in this area.

Results in the armed forces indicate that
certain auditory tests are more predictive of
subsequent code speed attained by radio operators
than any combination of printed tests
(see, for example, Craeger, 1954; Fleishman,
1955; Fleishman, Roberts, & Friedman, 1958).
However, until recently, little was known
about the abilities tapped by these tests. In a
recent study, Fleishman, Roberts, and Friedman
(1958) administered a battery of auditory
and printed tests to entering radio operator
students. A factor analysis of the correlations
of the tests and a criterion of code
proficiency was carried out. The loadings of
the criterion variable on the factors defined
by the reference tests helped specify the
sources of validity of such tests. Three of the
five factors identified were found to contribute
to code proficiency. Two of these were confined
to auditory tests and were labeled Auditory
Perceptual Speed and Auditory Rhythm
Discrimination. The third factor was measured
by printed tests and appeared to be one
of the perceptual Closure factors first identified
by Thurstone (see French, 1954).

1 This study was carried out while the first author
was with the Air Force Personnel and Training
Research Center, Lackland Air Force Base, Texas.
Morton P. Friedman provided valuable assistance in the
conduct of this study.

While these results throw light on the abili- 
ties contributing to code proficiency they do 
not tell us about the relative contribution of 
these abilities through the learning period. 
Other studies by Fleishman (1955, 1956,
1957) and by Fleishman and Hempel (1954,
1955) indicate that abilities contributing to
proficiency on complex tasks may change as
practice continues and proficiency increases.
It was decided to investigate this possibility
for the task of learning to receive Morse code.
The study represents an extension of these 
laboratory studies of skill learning to an 
actual job training situation. 


PROCEDURE 


A battery of 14 aptitude tests previously described
(Fleishman, Roberts, & Friedman, 1958) was
administered to 310 airmen entering radio operator
training. During the training, students are given daily
“code-checks” as part of the systematic evaluation
of progress. These code-checks indicate the student’s
level of proficiency, in terms of the number of code
groups per minute he can receive. In the present
study, it was possible to determine the number of
days² it took each student to reach four criterion
proficiency levels. Thus, the four criterion measures
used were the number of days needed (a) to learn
to receive 4 groups per minute, (b) to advance from
4 to 6 groups per minute, (c) to advance from 6 to
10 groups per minute, and (d) to advance from 10
to 14 groups per minute.

Previously, Fleishman, Roberts, and Friedman
(1958) factor analyzed the correlations³ among the
14 aptitude tests previously administered to these
students. Five aptitude factors were identified. In the
present study, correlations were obtained between the
four “stages of learning” criterion measures (described
above) and the aptitude test scores. Estimates
of the loadings of the four stages of learning,
on the rotated ability factors, were obtained by
means of Mosier’s extension method, as outlined by
Fruchter (1954). Also obtained were the multiple
correlations between the ability measures and the
number of days required to reach each of the four
criterion levels.

2 By this is meant the number of academic days
actually spent in training during each period by each
student. Each student’s record was adjusted for days
absent because of illness, holidays, or special events.

FIG. 1. Learning curve showing average number
of days required to reach different criterion levels
(N = 310).

RESULTS 


The mean numbers of days it took the group
to advance to each successive level of
proficiency are as follows: 14.53 days (SD =
4.57) to attain a speed of 4 groups per minute, 
4.38 days (SD = 4.07) to go from 4 to 6 
groups per minute, 19.24 days (SD = 8.48) 
to go from 6 to 10 groups per minute, and 
25.49 days (SD = 10.53) to go from 10 to 
14 groups per minute. Figure 1 shows the 
learning curve plot of these data. It can be 
seen that the curve based on these four points 
is quite smooth and resembles many other 
learning functions. We would need more 
points to demonstrate the appearance of a 
plateau (which Bryan and Harter [1899] 
originally demonstrated), but the absence of 
a plateau in our data is consistent with more 
recent evidence (Keller & Jerome, 1946; Reed 
& Zinszer, 1943; Taylor, 1943) that plateaus 
in code learning are exceptional. 
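The four means reported above are per-stage increments. A minimal sketch of how they accumulate into points on a learning curve follows; it assumes (the text does not say so explicitly) that Figure 1 plots cumulative days against code speed, and the variable names are illustrative only.

```python
from itertools import accumulate

# Code speed reached at the end of each stage (groups per minute) and the
# mean number of days spent in that stage, as reported in the text.
speeds_gpm = [4, 6, 10, 14]
mean_days_per_stage = [14.53, 4.38, 19.24, 25.49]

# Assumption: the curve plots total days elapsed when each speed is reached.
cumulative_days = [round(d, 2) for d in accumulate(mean_days_per_stage)]

# Average days needed per additional group per minute within each stage.
gains = [speeds_gpm[0]] + [b - a for a, b in zip(speeds_gpm, speeds_gpm[1:])]
days_per_gpm = [round(d / g, 2) for d, g in zip(mean_days_per_stage, gains)]

for speed, total in zip(speeds_gpm, cumulative_days):
    print(f"{speed:>2} gpm reached after about {total} training days")
```

On these figures the days needed per additional group per minute rise from the second stage onward, which is consistent with a negatively accelerated curve without implying a plateau.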

3 Since students qualified for training on the basis
of the Radio Operator Aptitude Index (a weighted
composite of Air Force Tests), all correlations were
corrected for restriction of range due to prior
selection.

The intercorrelations among these practice
stages are presented in Table 1. It can be
seen these correlations are not high; in other 
words, individuals who reach a code speed of 
4 groups per minute most quickly, are not 
necessarily the same individuals who progress 
from 4 to 6 groups, etc. most quickly. 

Table 2 presents the correlations obtained 
between the ability measures previously ad- 
ministered and the four criterion scores subse- 
quently attained. It is immediately clear from 
this table that the validities of these tests for 
predicting speed of code learning drop mark- 
edly from early to late stages of training. The 
Aptitude Index, used routinely by the Air 
Force to assign individuals to such training, 
drops from .47 to .32, .15, .20 through the 
subsequent stages. Clearly, most of the va- 
lidity for such tests in predicting final stand- 
ing is attributable to successful prediction 
during initial phases of training. 

To clarify some of these relationships we 
now turn to the projection of these four criteria
on the factor structure of these tests
previously established (Fleishman, Roberts, 
& Friedman, 1958). The five orthogonal, ro- 
tated factors extracted from the tests given to 
these same 310 students were defined as fol- 
lows. 


Visualization. This represents the ability
to make mental manipulations of visual objects,
in which it is necessary to twist, turn,
or invert one or more parts of a configuration
(in imagination) and to recognize the
new position, location, or changed appearance
after the modification. The best definers
were the printed tests, Pattern Comprehension,
Concealed Figures, and Designs.

Verbal Ability. The best definers were the
Word Knowledge and Background for Current
Affairs Tests.

Auditory Rhythm Discrimination. This
was general to auditory tests, where the
critical feature was perception of rhythm
patterns rather than speed. Best definers
were tests called Hidden Tunes, Dot Perception,
and Rhythm Discrimination.

Speed of Closure. This is defined as the
ability to unify or organize an apparently
disparate field into meaningful units. The
best definers were the printed tests called
Mutilated Words and Four Letter Words.

Auditory Perceptual Speed. This was best
defined by auditory tests in which the critical
feature was sheer speed of recognition
of the stimuli presented. The best measures
of this were the Copying Behind, Army
Radio Code, and Dot Perception Tests.


The complete description of these tests and
factors and the rotated factor matrix may
be found in Fleishman, Roberts, and Friedman
(1958).


TABLE 1
INTERCORRELATIONS OF NUMBER OF DAYS REQUIRED
TO PROGRESS BETWEEN THE FOUR
CRITERION LEVELS

Column headings: 4 to 6 gpm; 6 to 10 gpm; 10 to 14 gpm
To 4 gpm: .28, .29 [one entry missing]
4 to 6 gpm: .41 [one entry missing]
6 to 10 gpm: [entry missing]


TABLE 2
CORRELATIONS* BETWEEN THE ABILITY TESTS AND THE
NUMBER OF DAYS REQUIRED TO PROGRESS BETWEEN THE
FOUR CRITERION LEVELS

Column headings: Test Variable; To 4 gpm; 4 to 6 gpm; 6 to 10 gpm; 10 to 14 gpm
Auditory Tests: Dot Perception; Copying Behind; Army Radio Code Test;
Hidden Tunes; Rhythm Discrimination
Printed Tests: Mutilated Words; Four-Letter Words; Designs;
Pattern Comprehension; Concealed Figures; Gestalt Completion;
Marking Accuracy; Word Knowledge; Background for Current Affairs
[Correlation values and the footnote to the title not recoverable.]


TABLE 3

Row labels (Learning Period), number of days required to: Receive 4 gpm;
Progress from 4 to 6 gpm; Progress from 6 to 10 gpm; Progress from 10 to 14 gpm
[Table title and cell values not recoverable.]

Table 3 presents the loadings of the four
criterion measures used in the present study
(and represented in the four points on the
curve in Figure 1) on these five factors. The
clearest trend in this table is the progressive
decrease in the communalities from early to
late stages of training. This, of course, was
obvious from the correlations. The communality,
which is .48 in the initial stage, drops
to .28 in the second stage, .22 in the third
learning period, and .19 for the final stage. In
other words, later in learning, there is less
variance in common with aptitude test scores
than in earlier learning periods.
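As a worked illustration of what these communalities mean, the sketch below assumes the usual definition for orthogonal factors: the communality of a measure is the sum of its squared factor loadings. Only the Stage 1 communality (.48) and the two Stage 1 auditory loadings (.48 and .37) quoted in the text are used; whether the estimated loadings add up exactly this way is an assumption, and the names are illustrative.

```python
def communality(loadings):
    """Sum of squared loadings on orthogonal factors."""
    return sum(value * value for value in loadings)


# Stage 1 (days to receive 4 gpm): loadings quoted in the text.
stage1_auditory_loadings = {
    "Auditory Perceptual Speed": 0.48,
    "Auditory Rhythm Discrimination": 0.37,
}
stage1_communality = 0.48  # reported communality for the initial stage

# Share of the Stage 1 communality carried by the two auditory factors,
# and what is left for the other three factors combined.
auditory_part = communality(stage1_auditory_loadings.values())
remaining_part = stage1_communality - auditory_part

print(f"auditory factors: {auditory_part:.4f}")
print(f"other factors:    {remaining_part:.4f}")
```

Under that assumption, roughly three quarters of the variance that the first stage shares with the aptitude battery comes from the two auditory factors.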

The second indication is that there is a
shift in the particular factors contributing to 
performance at different stages of learning. 
During the early stage of training the fac- 
tors of Auditory Rhythm Discrimination and 
Auditory Perceptual Speed are the only fac- 
tors with loadings above .30. Thus auditory 
abilities seem most related to individual dif- 
ferences in speed-of-learning in the initial 
learning period. 

Inspection of the loadings for the second 
learning period indicates that while the esti- 
mated loadings are generally low, the highest 
of them is on the Speed-of-Closure factor. 
Auditory Perceptual Speed has dropped from 
.48 to .28 and Auditory Rhythm Discrimina- 
tion has dropped from .37 to .18. (The in- 
crease in the Speed-of-Closure factor is at- 
tributable to the fact that the Mutilated 
Words and Four Letter Words tests are the 
only two tests with either an increase or no 
change in their correlation from Stage 1 to 
Stage 2.) There are no loadings as high as .30 
during the last two stages of training. 

The last column of Table 3 presents the 
multiple correlation coefficients between 14 
aptitude test scores and each of the four 
“stages of learning” criterion scores. The re- 
sults of this analysis also indicate that the 
initial learning period is the most predictable, 
with the size of the multiple Rs dropping off 
for the later learning periods. 

DISCUSSION 

It is possible that the decreasing predict- 
ability of performance is due to some loss in 
criterion reliability at later training stages. 
We have no direct evidence on this. However,
data collected in early training are sufficiently
reliable to allow the high validities obtained.
It would take a drastic drop in criterion reliability
at later stages to account fully for the
validity decreases obtained at advanced lev- 
els. Moreover, evidence from other studies of 
learning data (e.g., Reynolds, 1952; Adams, 
1954) indicates that consistency of perform- 
ance during learning is actually higher during 
later stages. It is also clear that the between- 
S variability is sufficiently large during late 
stages of training and restriction of range is 
not a problem. 

Furthermore, the results obtained in this 
study of operational training performance, on 
an auditory-perceptual task, are consistent 
with earlier findings (Fleishman, 1956, 1957; 
Fleishman & Hempel, 1954, 1955) in labora- 
tory studies of perceptual-motor tasks. In each 
study it has been found (a) that the pattern 
of abilities contributing to performance on 
complex tasks changes progressively with practice,
and (b) there is an increase in a factor
specific only to the stages of practice on each 
of the tasks. This provides the alternative hy- 
pothesis that individual differences in per- 
formance become increasingly a function of 
habits and skills acquired during practice with 
the task itself. 


SUMMARY 


A battery of 14 ability tests was administered
to 310 entering radio operator students.
Records were obtained of the number of days
required by these students to reach four successive
proficiency levels of Morse code reception.
These were the number of days required
(a) to receive at a speed of 4 groups
per minute, (b) to progress from 4 to 6
groups per minute, (c) to progress from 6 to
10 groups per minute, and (d) to progress
from 10 to 14 groups per minute. Correlations
were obtained between the ability tests
and student progress during the four stages
of training. A factor analysis of the ability
measures was carried out and the factor structure
of the criterion measures determined.

The results indicate that early in training
individual differences in speed of learning Morse
code are, to a great extent, a function of two
auditory abilities, Auditory Rhythm Discrimination
and Auditory Perceptual Speed.
In an intermediate stage these abilities play
a smaller role with Speed of Closure (the ability
to organize stimuli meaningfully) of some
importance. Later in training individual differences
in these common abilities are less
critical. The overall predictability of proficiency
in code learning decreases as learning
continues and proficiency increases. This
indicates that currently used tests in this area
are valid mainly for predicting who will succeed
or fail in the initial periods of Morse
code training. Progress at later proficiency
levels appears to be less a function of general
ability variables and more a function of specific
habits acquired in training.


VALIDATION OF THE ABSTRACTION INDEX AS A
TOOL FOR CONTENT-EFFECTS ANALYSIS
AND CONTENT ANALYSIS


JACK B. HASKINS 


The Curtis Publishing Company 


One of the criteria used to judge literary
materials is abstractness. Abstractness connotes
the generalizability, universality, and
permanence of the ideas expressed in a piece
of writing, as opposed to writing which evokes
concrete, specific, and detailed images.

Abstractness and its connotations may thus
logically be placed at one end of a continuum
with concreteness and its connotations at the
other end. Gillie (1957) has described the development
of a simplified method for measuring
this particular literary quality, its measure
being the “Abstraction Index.”

An objective measure of abstractness can be
useful in at least two ways: Straight content
analysis which describes and compares various
kinds of literary output (e.g., 18th century vs.
20th century novels, 2nd grade vs. 10th grade
history texts, English newspapers vs. American
newspapers, Reader’s Digest vs. The
Saturday Evening Post, etc.); Content-effects
analysis which relates literary characteristics
to the behavior and attitudes of human beings
in a communications situation.
Practitioners of straight content analysis frequently
make inferences as to effects without
validation of their content measures or categories,
that is, without ascertaining whether
their content hypotheses are empirically related
to the effects ascribed; there is inference
backward to the communicator or to the situation
described (e.g., “Philadelphia newspapers
contain more crime news than other papers;
therefore Philadelphia has more crime than
other cities”) and inference forward to the
communicatee (e.g., “Philadelphia newspapers
print more crime news than other papers, therefore
Philadelphia citizens read more crime
news than residents of other cities”). Historians
are especially fond of such inferences.


The purpose of this paper is twofold:


1. To determine the relationship in various
magazine items between abstractness,
as measured by Gillie’s Abstraction Index
(AI), and reader behavior and attitude.
This is a test of the validity of AI against
behavioral criteria.

2. Some methodological testing of the AI
itself to determine the extent and kind of
content sampling necessary to get a reliable
index, and to make some practical use
of the AI in describing the content and
stylistic tendencies of an issue of The
Saturday Evening Post.

It is desirable, of course, for the AI to
demonstrate some validity in Phase 1 before
using it as a measure in Phase 2.


METHOD 


Two hundred word samples of content were taken
from the 12 major editorial items (articles, short
stories, serials) in the March 1959 issue of The
Saturday Evening Post. The samples were taken
systematically; starting with the second paragraph of
each item 200 words were counted, 400 words were
skipped, 200 counted, etc., until the end of the item
was reached. That is, each sample after the first
started 600 words from the beginning of the preceding
sample; roughly one third of the content of each
item was tested in this way. The AI as described by
Gillie was computed for each sample thus chosen.
The average of all sample AIs in an item was
taken as the whole-item AI. The mean of the 12
item AIs was taken as the issue AI. The results of
this sampling and computation are shown in Table 1.

A nationwide sample of readers (N = 340) of the
March 1959 Post were questioned on (a) readership
of the items (four degrees of readership were
recorded: did not see, saw, read some, read all) and,
(b) satisfaction with items read (measured on a
five-step verbal scale, “excellent” to “poor”).

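The systematic sampling described in the Method section (count 200 words, skip 400, repeat) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the word list is assumed to begin at the item's second paragraph, and dropping a final sample shorter than 200 words is an assumption, since the text does not say how a short remainder was handled. Gillie's AI formula itself is not reproduced here.

```python
def systematic_samples(words, sample_len=200, skip_len=400):
    """Take `sample_len` words, skip `skip_len`, and repeat to the end.

    `words` is assumed to start at the second paragraph of the item.
    A trailing sample shorter than `sample_len` is dropped (assumption).
    """
    samples = []
    start = 0
    while start + sample_len <= len(words):
        samples.append(words[start:start + sample_len])
        start += sample_len + skip_len  # next sample begins 600 words later
    return samples


def whole_item_index(sample_scores):
    """Whole-item AI: the average of the AIs of all samples in the item."""
    return sum(sample_scores) / len(sample_scores)


# A stand-in "item" of 3900 words; integers take the place of real words.
item_words = list(range(3900))
item_samples = systematic_samples(item_words)
sampled_share = sum(len(s) for s in item_samples) / len(item_words)
```

For a 3900-word item this yields seven samples covering a little over a third of the text, in line with the "roughly one third" stated above.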
CONTENT-EFFECTS ANALYSIS: HYPOTHESES 
AND RESULTS 


The following hypotheses and results show
the relationship between abstractness and
readership/satisfaction.

1 The author is indebted to Margaret King, Research
Assistant, who performed all AI computations.


TABLE 1

COMPUTED ABSTRACTION INDEX* FOR ITEMS AND SAMPLES WITHIN ITEMS
FROM THE MARCH 7, 1959 Saturday Evening Post

Column headings: Position of Sample within Item (numbered up to 12);
Whole Item (Body-Text); Title and Subtitle; Approx. Length (Words)
[Most cell values and the footnote to the title are not recoverable.
Surviving entries, in the order found:]
Title and Subtitle: 67, 36, 35, 37, 40, 32, 39, 30, 40, 26, 22
Approx. Length (Words): 3900, 3900, 5100, 5100, 3900, 6900, 3300,
4500, 3300, 4500, 3900, 7500
Last item, sample AIs: 47, 64, 58, 64
Mean, All Items: 50.3, 52.3, 49.0, 53.0, 64; 54.5


“Initial attention value” of a magazine item
is indicated by the percentage of the audience
who saw the item (saw %). Attention is presumably
influenced by the display elements of
the item: the title and subtitle, illustration,
use of color, layout, type face, etc.; the abstractness
of the body-text would presumably
not affect initial attention.

Thus, Hypothesis 1: Initial attention value
of a magazine item is not related to abstractness
in the body-text. CONFIRMED; the
rank correlation between “saw %” and item
AI is -.02 on the 12 items, not significant
(Tables 2 and 3).


TABLE 2

ABSTRACTNESS AND READERSHIP MEASURES ON 12 Saturday Evening Post ITEMS

Columns:
A. Item AI (Body Text)
B. Title and Subtitle AI
C. Attention Value (“Saw %”)
D. Started To Read (“Read Some %”)*
E. Finished the Item (“Read All %”)
F. Finishing Rate (Read All % ÷ Read Some %)
G. Satisfaction (% of Readers Who said “Excellent”)*

[Most cell values are not recoverable. Surviving entries, in the order found:]
B and C, first five rows: 67 and 84; 36 and 83; 35 and 85; 37 and 84; 40 and 93
C, later rows: 96, 88, 83, 83, 92
D, E, F, G, first three rows: 49, 46, 93, 30; 39, 30, 78, 32; 38, 33, 86, 39

* In accordance with Curtis editorial policy, [...] in these columns have been
altered by a constant; the re[...] each column is not affected.


TABLE 3
RANK CORRELATION AMONG READERSHIP AND
ABSTRACTNESS MEASURES IN TABLE 2

Hypothesis    Variables    Rank r
1             AC           -.02
2             BC           +.04
3             BD           +.02
4             AD           +.40
5             AF           -.79*
6             AG           +.80*

* Probability [remainder of note not recovered]

However, since the title and subtitle of a 
magazine item are among the cues affecting 
attention value, we would expect abstractness 
in the title and subtitle to be negatively re- 
lated to the “saw %.” 

Thus, Hypothesis 2: Initial attention value 
of a magazine item is negatively related to 
abstractness in the title and subtitle. NOT 
CONFIRMED; rank correlation between 
“saw %” and title and subtitle AI is +.04, 
not significant (Tables 2 and 3). 

The proportion of an audience who start to
read an item (“read some %”) is presumably
a function of the initial attention value of the
item; if interested in the apparent subject
matter and not discouraged by the title and
subtitle abstractness, an individual will start
to read the item.

Thus, Hypothesis 3: The proportion of the
audience starting to read an item is negatively
related to abstractness in the title and
subtitle. NOT CONFIRMED; rank correlation
between “read some %” and title and
subtitle AI is +.02, not significant (Tables 2
and 3).

The people who start reading an item cannot
be expected to know, at that point, the
abstractness level of the body-text of the
whole item.

Thus, Hypothesis 4: The proportion of the
audience starting to read an item is not
related to body-text abstractness. CONFIRMED;
rank correlation between “read
some %” and body-text AI is +.40, not significant
(Tables 2 and 3).

The “finishing rate” of a magazine item is
the proportion of those starting to read who
finish it (number who read all ÷ number who
read some). Of those who start to read an
item, some will be discouraged by abstractness
or “toughness.”

Thus, Hypothesis 5: The proportion of the 
starters who finish an item is negatively 
related to body-text abstractness. CON- 
FIRMED; rank correlation between “‘finish- 
ing rate” and body-text AI is —.79, signifi- 
cant at .01 (Tables 2 and 3). 

Presumably, more satisfaction is derived 
from completion of a tough task than an easy 
one. Equating abstractness with toughness, we 
expect more satisfaction in the reading of an 
abstract item than in a concrete item. The 
“excellence %” of an item (the proportion of 
those reading an item who rate it excellent) 
is a measure of satisfaction. 

Thus, Hypothesis 6: The proportion of 
readers who rate an item “excellent” is 
positively related to body-text abstractness. 
CONFIRMED; rank correlation between the
“excellence %’’ and body-text AI is +.80, 
significant at .01 (Tables 2 and 3). 
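The rank correlations reported for Hypotheses 1 through 6 compare the ordering of the 12 items on two measures. A minimal sketch of such a computation follows. It is not taken from the paper: the text does not say how ties were handled, so average ranks are assumed, and the figures in the example are invented for illustration rather than drawn from Table 2.

```python
def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def rank_correlation(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank sets."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical body-text AIs and finishing rates for five items.
body_ai = [29, 41, 47, 52, 58]
finishing_rate = [78, 74, 88, 86, 93]
rho = rank_correlation(body_ai, finishing_rate)
```

With only 12 items a coefficient must be fairly large to reach significance, which is why values such as +.40 are reported above as not significant.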


CONTENT ANALYSIS: HYPOTHESES 
AND RESULTS 


The following hypotheses and results are
tests of methodology in the use and application
of the Abstraction Index.

Abstractness implies generality and summary
as opposed to concreteness and specificity.
The title (including subtitle) of a
magazine item presumably functions as a
synopsis or abstract of the full item, to provide
the prospective reader with the cues on
which to base his decision to read or not read.

Thus, Hypothesis 7: Titles of editorial
items are in general more abstract than the
body-text. CONFIRMED; mean AI for titles,
36.6; for body-text, 47.9 (Table 1).

In applying the AI in practical situations,
the question arises as to how many samples
of content are necessary to achieve reliable
results. An item perfectly homogeneous regarding
abstractness would require only one
200-word sample to reflect the item accurately.
Since this is an unlikely situation, the
relationship of one-, two- and three-sample
AIs, selected from varying systematic positions
within the items, compared to the whole-item
AI, was determined.

Thus, Hypothesis 8: One-, two- or three-sample
AIs within items are significantly and
positively related to whole-item AIs. CONFIRMED.
Rank correlations obtained with whole-item
abstractness are as follows: one-sample
AIs, +.77 to +.91; two-sample AIs,
+.84 to +.96; three-sample AIs, +.88 to
+.95. None of the sampling procedures
yield mean values greater than 1.7 AI points
from the mean for whole items; all mean
values are well within the range (43-54)
designated by Gillie as “standard”
(Table 4).

Hypothesis 9: Samples from certain positions
within items are better indicators of
whole-item abstractness than are samples from
other positions. CONFIRMED; the best position
from which to take one sample is
the middle of items, the best for two-sample
tests at the beginning and end, and the best
for three-sample tests the beginning,
middle, and end. In general, samples taken
systematically from any of the named positions will
give a good indication of the whole-item AI
(Table 4).

While the two foregoing hypotheses demonstrate that the content within
items is relatively homogeneous regarding abstractness (otherwise
one-sample AIs would yield little correlation with whole-item
abstractness) there remains the possibility that writing style changes
in some systematic way throughout items; a writer might be envisioned as
becoming “looser” or more specific as he proceeds in his writing.

Thus, Hypothesis 10: The abstractness in a magazine item tends to change
in a systematic way as one progresses through the item from beginning to
end. NOT CONFIRMED; while 8 of the 12 items show a tendency to become
less abstract as one progresses through the item, the relationship is in
general so weak that no systematic change can be ascribed (Table 5).

Finally, in [...] scores ([...] 31-42 [...] 55-66, fairly concrete;
67-78, concrete; [...] very concrete), half the items ranged over two
categories and half ranged over three categories; none of the items was
wholly contained within one category. The range of scores among items
within the Post issue was from 29 (abstract) to 58 (fairly concrete),
with the issue AI at 48 (standard) (Table 1).


TABLE 4

RANK CORRELATION AND MEAN FOR ONE-, TWO-, AND THREE-SAMPLE AIs COMPARED
WITH AI FOR WHOLE ITEMS IN The Saturday Evening Post (a)

Number of       Position of        Rank Correlation
200-Word        Sample(s)          with Whole-Item
Samples         Within Item        AI (b)               Mean AI (c)

1               First (F)          ...                  ...
1               Middle (M)         ...                  ...
1               Last (L)           ...                  ...

2               F & L              ...                  ...
2               First Two          ...                  ...
2               Middle Two         ...                  ...
2               Last Two           ...                  ...
2               F & M              ...                  ...
2               M & L              ...                  ...

3               F, M, & L          ...                  ...
3               First Three        ...                  ...
3               Middle Three       ...                  ...
3               Last Three         ...                  ...

(a) Computed from values in Table 1.
(b) All values positive and significant at .01 level.
(c) Compare with mean AI for whole items, 47.9.

TABLE 5

RELATIONSHIP OF SAMPLE SERIAL POSITION WITHIN ITEMS TO SAMPLE AI

Item        Rank r        Number of Samples Within Item


4 
81 
AS5 
05 
32 
.09 
37 
AS 
30 

30 
09 
34 


ne Wh 


1a 


SUMMARY AND CONCLUSIONS 


One way of describing literary materials is 
in terms of “abstractness.” The major purpose 
of this study was to determine the effects of 
abstractness in writing on reader behavior and 
satisfaction. 

The Abstraction Index (AI) was computed 
for all major articles and stories in an issue of 
The Saturday Evening Post. A national sam- 
ple of readers of this issue were interviewed 
to get readership and satisfaction data on the 
items. 

Abstractness was found to be a discourag- 
ing factor to some readers’ finishing an item; 
the more concrete the item, the more likely 
they were to finish reading it. The rank cor- 
relation between “finishing rate” and abstract- 
ness was —.79 on the 12 items. 

However, abstractness is a desirable quality 
regarding the satisfaction a reader gets from 
an item; the more abstract the item, the 
higher his satisfaction. Among those readers 
who completed the reading of an item, the 
rank correlation between abstractness and the
proportion who rated the item “excellent” on
a 5-step satisfaction scale was +.80. Also, the 
absolute number of persons getting high satis- 
faction from abstract items was greater than 
for more concrete items, in spite of the lower 
readership of the former. 

Abstractness in the body-text did not affect 
initial attention to items. Abstractness in the 
title and subtitle did not affect either initial 
attention to items or the proportion of per- 
sons who started reading the body-text. 

Some tests were conducted to determine how many 200-word AI samples were
necessary to give reliable indications of whole-item abstractness, and
to determine the optimum location within items from which to take
samples. It was found that one-, two- or three-sample AIs were adequate
for determination of whole-item abstractness. The optimum location for
one-sample AIs was the middle of items; for two-sample AIs was the
beginning and end of items; for three-sample AIs was the beginning,
middle, and end of items.

The titles (including subtitles) of items were found to be more abstract
than the body-text of the items.

Eight of the 12 items showed a tendency to become less abstract, more
concrete, as one progressed through the item from beginning to end;
however, the tendency was so weak as to be nonsignificant.

It is concluded that the Abstraction Index is a useful tool in the
measurement of reader reaction to magazine editorial items
(content-effects analysis) and has validity in the description and
evaluation of literary materials (content analysis).


REFERENCE 


Gillie, P. A simplified formula for measuring abstraction in writing.
J. appl. Psychol., 1957, 41, 214-217.


(Received June 29, 1959) 





EVALUATING TERRITORIAL SALES EFFORTS 


ROBERT 


State Farm Insurance ( 

There were two primary objectives in con-
ducting a study concerned with evaluating
territorial sales efforts. The first was the de- 
velopment of a valid, objective, reliable meas- 
ure (or measures) to use in evaluating the 
automobile insurance sales efforts of each of 
30 regional sales organizations. In construct- 
ing this measure the differing potentials for 
sales in the separate territories, as well as the 
degree of realization of this potential, were to 
be taken into account. Second, was the deter- 
mination as to whether or not effectiveness per 
agent in each territory was related to the mar- 
ket potential per agent. 

The study is presented as an example of 
the application of the method of factor analy- 
sis to the problem of attaining the above ob- 
jectives. 


BACKGROUND 


The corporation in which this evaluation 
was conducted is engaged in marketing auto- 
mobile, life, and fire insurance. Automobile in- 
surance is the basic product and, as such, is 
the primary source of income to its agents. 
The agents are independent contractors, but 
are supposed to represent the corporation 
solely. The district manager, an employee of 
the companies, selects, trains, and supervises 
the agents within his territory. There is one 
manager for approximately every 12 agents. 
The managers’ supervisors consist of State Di- 
rectors, Assistant State Directors and Agency 
Supervisors. Each state (or in some cases a 
group of small states) has one of these sepa- 
rate and relatively autonomous management 
groups. Although there are over 7500 agents 
in the United States, not all of them work 
full time and not all are sole representatives. 

The companies’ philosophy concerning the 
employment of salesmen changed abruptly in 
1950 from the policy of contracting agents on 
a part-time basis and permitting them to rep- 
resent other companies to one of recruiting 
agents on a full-time basis requiring sole rep- 
resentation. Consequently, there are still a 
number of part-time agents who were con-
tracted under the old policy in parts of the
companies’ operating territory. The number 
of part-time agents in a territory is, therefore, 
related to the time when the companies went 
into that market. In state sales organizations 
where the companies have been well estab- 
lished for a number of years, there are pro- 
portionately more part-time agents than in 
states where it has been operating only in the 
last few years. This last fact makes any pro- 
cedure for evaluating a state’s progress very 
difficult. It is one of the problems which was 
attacked in the construction of the measures 
described below. 


PROCEDURE 


The chief interest in this study was performance in the sale of
automobile insurance only. Consequently, we will be concerned with
licensed agents and performance data solely from the records of the
automobile company.

Nineteen different performance measures about each state were selected
for inclusion in the study. The figures were based on the year 1955.
These measures were:

1. Private and commercial automobile registrations 
for 1955. 

2. Policies in force in the company. 

3. Potential (Variable One minus Variable Two). 
This represented an estimate of the number of auto- 
mobiles registered which were a potential additional 
market for the company. (In some states the com- 
pany has policies covering more than 20% of the 
registered automobiles.) 

4. Total sales. New sales production consisted of both new and
reinstated automobile applications.

5. Total number of agents. This was the total number of agents who were
under contract as of January 1, 1955.

6. Number of career agents. A career agent in the company was one who
earned 1200 points or more during the calendar year. These points were a
management-determined subjective system for evaluating individual sales
efforts. Two points were given for each automobile coverage sold, eight
points for each $1000 of life insurance sold, and two points for each
$100 of fire premium sold. Management considered 1200 points a minimum
full-time job for an experienced agent. Therefore, the number of career
agents was approximately the number of agents who were doing a full-time
job. This tended to eliminate the part-time agents.

7. Total number of jobs. This represented the estimate of the number of
effective jobs which were in existence in each state. An effective job
was defined in terms of the total number of points produced during the
year. Arbitrarily, the attainment of 1200 points or more was designated
as one full-time job. In order to account for the fact that different
territories had different proportions of part-time to full-time agents,
the following procedure was used: The total number of points for all
agents who made fewer than 1200 points was found for each state and this
total was divided by 1200 in order to arrive at the number of full-time
jobs being accomplished by part-time agents in the state. This number
was added to the number of agents producing 1200 points and more. The
resulting figure was an approximation of the number of full-time jobs
being done in the state.

It should be noted that there is some circularity of 
reasoning involved due to the fact that a job was 
defined, in some measure, in terms of the number of 
sales. However, as the total number of points from 
all three lines were used as a measure of the job, 
there should be less of this circularity present. 

It was recognized that the method of determining 
the number of total jobs was not an adequate substi- 
tute (for statistical purposes) for having all agents 
working full time. However, under existing condi- 
tions, it seemed that the method used would give an 
estimate which would be of value.

8. Percentage of sales to potential (Variable Four divided by Variable
Three).

9. Potential per number of jobs (Variable Three divided by Variable
Seven).

10. Registration per career agent (Variable One divided by Variable
Six). This gave a different estimate of the market per agent than the
one shown in Variable Nine. Because there were more registrations than
potential and fewer career agents than number of total jobs, this index
is an upper limit of the market for each agent in a state.

11. Percentage of agents who were career agents (Variable Six divided by
Variable Five).

12. Average point production of part-time agents. This was the average
number of points produced by agents who worked the entire year but who
had less than 1200 points.

13. Percentage of career agents to total jobs (Variable Six divided by
Variable Seven).

14. Total sales per number of jobs (Variable Four divided by Variable
Seven).

15. Sales per total number of agents (Variable Four divided by Variable
Five).

16. Sales per policies in force (Variable Four divided by Variable Two).

17. Ratio of increase to written business. This was the ratio of
increase of business over that of the previous year to the business
written during the year, and gives a measure of the persistency of the
business.

18. Policies in force per number of jobs (Variable Two divided by
Variable Seven).

19. Loss ratio. The loss ratio was obtained by dividing the loss,
through claims incurred for that period, by the total amount of premium
in force for the same period.
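
The point scheme of Variable Six, the job estimate of Variable Seven, and the ratio measures built from them can be sketched in a few lines. This is only an illustration of the arithmetic as described above: the function names are hypothetical, and every input figure in the example is invented rather than taken from the company's records.

```python
FULL_TIME_POINTS = 1200  # management's minimum for a full-time (career) agent


def agent_points(auto_coverages, life_dollars, fire_premium_dollars):
    """Points earned: 2 per automobile coverage, 8 per $1000 of life
    insurance, 2 per $100 of fire premium (Variable Six)."""
    return (2 * auto_coverages
            + 8 * life_dollars / 1000
            + 2 * fire_premium_dollars / 100)


def total_jobs(points_by_agent, threshold=FULL_TIME_POINTS):
    """Variable Seven: agents at or above the threshold count as one job
    each; points of all other agents are pooled and divided by 1200."""
    career_agents = sum(1 for p in points_by_agent if p >= threshold)
    part_time_points = sum(p for p in points_by_agent if p < threshold)
    return career_agents + part_time_points / threshold


def derived_measures(registrations, policies, sales, agents, career_agents, jobs):
    """Ratio measures defined in terms of Variables One to Seven."""
    potential = registrations - policies                 # Variable Three
    return {
        "potential": potential,
        "sales_to_potential": sales / potential,         # Variable Eight
        "potential_per_job": potential / jobs,           # Variable Nine
        "registrations_per_career_agent": registrations / career_agents,  # Ten
        "career_share_of_agents": career_agents / agents,  # Eleven
        "career_share_of_jobs": career_agents / jobs,    # Thirteen
        "sales_per_job": sales / jobs,                   # Fourteen
        "sales_per_agent": sales / agents,               # Fifteen
        "sales_per_policy": sales / policies,            # Sixteen
        "policies_per_job": policies / jobs,             # Eighteen
    }


# Hypothetical state with five agents (annual point totals are invented).
points = [1500, 1300, 600, 300, 300]
jobs = total_jobs(points)  # 2 career agents + 1200 pooled points / 1200 = 3.0
measures = derived_measures(registrations=1000, policies=200, sales=120,
                            agents=5, career_agents=2, jobs=jobs)
```

Pooling the part-time points is what lets states with many part-time agents be compared with states staffed mostly by full-time agents, at the cost of the circularity the author notes.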

Rank correlations by state (N = 41) among these 19 measures were
computed and factor analyzed using the Thurstone multiple-group method.
The resulting factors were rotated graphically to orthogonal simple
structure.
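
As a rough illustration of the first step, the sketch below computes a rank-correlation matrix among measures. It assumes Spearman's coefficient with average ranks for ties; the text says only "rank correlations," so the exact coefficient used is an assumption. The measure names and state values are invented, and the multiple-group factoring and graphical rotation are not attempted here.

```python
from math import sqrt


def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_correlation(x, y):
    """Spearman's rho: the product-moment correlation of the two rank lists.
    Undefined (division by zero) if either list is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(measures):
    """All pairwise rank correlations among named measures."""
    names = list(measures)
    return {(a, b): rank_correlation(measures[a], measures[b])
            for a in names for b in names}


# Hypothetical values for four states on three of the measures.
states = {
    "registrations": [900, 400, 650, 120],
    "total_sales": [80, 35, 60, 10],
    "loss_ratio": [0.52, 0.61, 0.48, 0.55],
}
matrix = correlation_matrix(states)
```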


RESULTS 


Table 1 shows the rotated factor matrix. 
There were no residuals greater than ±.10.


DESCRIPTION OF FACTORS 


Factor I: Absolute Size. This factor was the dimension which represented
the size of the state as measured by total car registrations, policies
in force, potential, total sales, total number of agents, number of
career agents, and total number of jobs in the state.

Factor II: Potential per Agent. This factor was the measure of potential
available to each agent in the state, and was identified by sales to
potential, potential per job, and registrations per career agent. There
was a slight relationship between sales per policy and policies per job
to the factor.

Factor III: Over-all Effectiveness. This factor is the dimension which
represents the best combination of the 19 original measures that gave an
over-all measure of the effectiveness of the state sales organization.
It was measured by sales per agent and sales per job and also identified
by percentage of agents who were career agents, production of part-time
agents, percentage of career agents to total jobs, sales per policies in
force, increase to written, and sales to potential.

Factor IV: Manpower Utilization. This factor, which was a group or
subfactor of Factor III, Over-all Effectiveness, was identified by the
percentage of agents who were full time, production of the part-time
agents, percentage of the agents to the total jobs in the state, and
slightly, by sales per agent.

Factor V: Rate of Growth. This factor was also a subfactor of Over-all
Effectiveness and is identified by sales per policies in force, ratio of
increase to written business, and policies per job (negatively).

TABLE 1

ROTATED FACTOR MATRIX FOR 19 VARIABLES

Factors: I, Absolute Size; II, Potential per Agent; III, Over-all
Effectiveness; IV, Manpower Utilization; V, Rate of Growth

Variables                                      I (Absolute Size)

Registrations                                  9?**
Policies in force                              97**
Potential                                      92**
Total sales                                    94**
Total agents                                   93**
Number of career agents                        95**
Total number of jobs                           97**
Sales to potential                             08
Potential to jobs                              07
Registration per career agent                  14
Percentage of agents with 1200 points          ?5
Production of part-time agents                 33
Percentage of career agents to total           ...
Sales per job                                  18
Sales per agent                                ...
Sales per policy                               19
Increase to written business                   18
Policies per job                               02
Loss ratio                                     13

Loadings printed without a legible row position: Factor I, 21 and 33*;
Factors II-V, in printed order, 03, 04, 10, 10, -07, 17, 07, 08, 15, 15,
00, 00, -18, 08, -47**, 63**, 56**, 68**, 04, 32*.


DISCUSSION AND CONCLUSIONS


There are two kinds of factors present: nonperformance and performance.
The two nonperformance factors are a function of something other than
the ability of the agency force in the state and are Absolute Size and
Potential per Agent. The three performance factors represent differences
in performance of the agency forces in the states, and consist of one
general or global factor, Over-all Effectiveness, and the two
subfactors, Manpower Utilization and Rate of Growth.

The three component parts of Over-all Effectiveness were sales per
agent, utilization of manpower, and growth rate in the state.

The potential in an absolute sense or the market potential per agent was
not related to two of these evaluative measures, Over-all Effectiveness
and Manpower Utilization. Further, we can conclude that effectiveness
per agent in the state was not related to the potential per agent.
Apparently other phenomena, such as managerial effectiveness, extent of
training, and experience of the sales organization, had the effect of
negating any impact which potential might have had.

The degree of realization of potential was 
taken into account, in part, in the measure- 
ment of over-all effectiveness, inasmuch as 
the amount of sales to potential was one of 
the individual measures which constituted the 
factor. Generally speaking, however, the de- 
gree of realization of potential appeared to be 
less important in the evaluation of the over-all 
effectiveness of the state than other measures 
such as sales per agent, manpower utilization, 
and growth rate. 

Finally, the absolute size of the sales or- 
ganization (Factor I) was not related to its 
performance (Factors III, IV, and V). 

The research has not yet led to an operat- 
ing report which will give factor scores as in- 
dicators of successful management. There is 
currently in progress a repetition of this study 
incorporating additional similar measures of 
life and fire insurance production, policies in 
force, and market potential, along with man-
power statistics such as turnover and financ-
ing cost. When this study is completed, it is
hoped that an operating report will result 
which will contain factor scores for each state 
on the relevant performance factors in the dif- 
ferent areas of endeavor. 


SUMMARY 


The purpose of this investigation was to 
provide a better method of evaluating the 
sales efforts of different state sales organiza- 
tions in the marketing of automobile insurance. 
Nineteen measures representing quantitative 
production achievements, manpower statistics,
over-all market statistics, amount of business
in the area, and combinations of these were
factor analyzed. Five factors emerged, two of
which were not evaluative measures and three
that might ultimately be used as such. The
five factors were: Absolute Size, Potential per
Agent, Over-all Effectiveness, Manpower Uti-
lization, and Rate of Growth.

Although the first two factors were not di-
rectly associated with the evaluation of the
States, their presence gave some additional
knowledge about the other three factors.



INDUSTRIAL PSYCHOLOGY IN PSYCHOLOGICAL
ABSTRACTS, 1927-1959 


H. MELTZER 


Orchard Paper Company, St. Louis, Missouri 


Many men in industry, as well as psycholo- 
gists, are impressed with the rapid growth of 
industrial psychology. The rate and growth in 
what Haire chooses to call “. . . the segment
within psychology devoted to problems re- 
lated to industry” (Haire, 1959, p. 169) is 
a fact which he thinks many of us do not 
realize and, if fully considered, is striking. 
There is no doubt about the growth of the
field as measured by increased membership in 
Division 14 of the APA. In the last decade its 
membership has increased almost four-fold. 

What is assumed without evidence is that 
along with the growth of membership there 
also has been a growth in productivity. “Psy-
chologists,” we are told, “are becoming in-
creasingly productive—or at least more pro-
lific” (Kunin, 1958, p. 475). In the field of
industrial psychology what are the facts con- 
cerning increase in output over the years? 
Fortunately in the Psychological Abstracts 
there is an available source which makes it 
possible to compare output year by year in 
any given field from the time of its beginning 
in 1927. The articles are abstracted in brief 
and do not lend themselves to an evaluative 
study of the nature, comprehensiveness, or 
quality of the publications, but do lend them- 
selves to a count study, making possible a 
comparative study of output over the years. 
By comparing the number of articles reported
in any given year with any other given year,
it is possible to compare productivity consid-
ered in terms of output per year and to ob-
serve increases and decreases over the years.
Also, it is possible to discover in what years 
the gains have been made, the extent to which 
they have been sustained over the years, and 
what the general growth trend has been as 
far as output is concerned. To determine these 
and related questions concerning productivity 
in the field of industrial psychology is the pur- 
pose of the present study. 


CONTRIBUTIONS REPORTED IN Psychological 
Abstracts, 1927-1958 


All articles appearing in the Psychological 
Abstracts under the categories of “Industrial” 
and “Personnel” (sections classified as “In- 
dustrial and Personnel Problems,” “Personnel 
Psychology,” and “Industrial and Other Ap- 
plications”) psychology from its beginning in 
1927, when it first appeared as a separate 
monthly, until 1959, were tabulated for the 
purpose of making the comparative growth 
study over the years. The number of articles 
abstracted annually from 1927 to 1958, in- 
clusive, is reported in Table 1. The data re- 
ported from 1927 to 1946 are taken from 
tables reported by Flanagan (1947, p. 145) 
in his contribution to personnel psychology in 
the 1947 University of Pittsburgh studies on 
current trends in psychology. In summary 
fashion these data are presented by Meltzer 
(1948, p. 332). Here, considered in terms of 
decades, there is reported a definite increase, 
jumping from 150 in 1927 to closer to an av- 
erage of 250 in the 1930’s and hitting above 
500 in 1946. This last spurt was described at 
that time as an overflow of war-accumulated 
studies in psychology and psychiatry. 

What does a study of Table 1 show about 
the growth, extent of productivity over the 
years, gains made, gains sustained, and the 
general trend considered in the light of in- 
crease in the present number of contributors 
who are identified with the field of industrial 
psychology? A glance at the table does show 
increases at times, but the increases are not 
continuous and not sustained. For whatever 
the reasons be, there were peak years (1930,
1934, 1946, 1951, 1955), but they have been
preceded and followed by leaner years. As re-
ported in the Abstracts, it seems that indus-
trial psychology cannot stand prosperity and
has its recessions. Outstandingly low years
were 1939, 1941, and 1958.

TABLE 1

INDUSTRIAL PSYCHOLOGY ARTICLES IN Psychological Abstracts

[Annual article counts by year, 1927-1958; tabulated values not legible.]


To avoid, or at least to reduce, the possibility of one set of
circumstances or effects unduly influencing or distorting the output of
any given year as an index of a general trend, we have tabulated the
data in spans of three years. So viewed, the discontinuity and
inconsistency of trend appears lessened, but is still there. What shows
up this way is a total output of 506 articles in the first three years,
and 851 and 853 articles in successive three-year spans, which include
1935. From 1936 to 1942 there is a six-year period of decrease, from 795
for the first three years to 694 for the last three, ending in 1941.
From 1942 to 1957 there is a general increase, considered in three-year
spans. For the three years beginning with 1942 there are 964 articles.
The total for 1945, 1946, 1947 is, for the first time, above 1000
articles: 1314. The total for the three-year spans is well above 1000
articles for every period through 1956. The lowest year in more than a
decade is 1958, and the possible reasons for this are considered in the
following paragraph.
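
The three-year tabulation can be sketched as follows. The helper name and the annual counts in the toy example are invented for illustration; only the list of span totals is taken from the figures quoted in the paragraph above.

```python
def span_totals(counts_by_year, first_year, last_year, width=3):
    """Sum annual counts into consecutive spans of `width` years.
    A final span shorter than `width` is summed over the years available."""
    totals = {}
    for start in range(first_year, last_year + 1, width):
        years = range(start, min(start + width, last_year + 1))
        totals[(start, years[-1])] = sum(counts_by_year[y] for y in years)
    return totals


# Hypothetical annual counts, used only to show the grouping.
toy_counts = {1927: 100, 1928: 150, 1929: 200, 1930: 300, 1931: 250, 1932: 260}
toy_spans = span_totals(toy_counts, 1927, 1932)

# Three-year totals quoted in the text for the spans beginning 1927,
# 1930, 1933, 1936, 1939, 1942, and 1945.
reported_totals = [506, 851, 853, 795, 694, 964, 1314]

# Change from each span to the next; negative values mark the declines.
changes = [later - earlier
           for earlier, later in zip(reported_totals, reported_totals[1:])]
```

On the quoted totals the two negative changes fall in the 1936-1941 period, which matches the six-year decrease described above.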


EXPLAINING INCREASES AND DECREASES OVER THE YEARS


How can the fluctuations and variations, the ups and downs reported in
Table 1 be explained? Of course, the recession is the first thing that
comes to mind for most people as an explanation for everything in 1958
that is low. Now whether the 1958 recession is responsible to any
measure for the low number of abstracts in that year is questionable.
There is some evidence that the Abstracts were not well covered due to
changes in the Editorial Office. For example, in the February 1959 issue
of Psychological Abstracts, there were almost 200 abstracts under
“Personnel Psychology” and “Industrial and Other Applications,” all of
which were published in 1957, except eight books and three articles
published in 1958, and three articles published in 1956. This may be a
usual backlog, though it seems somewhat delayed and may in part be
responsible for the low count for 1958. Also, Psychological Abstracts
apparently does not cover European literature as well as it might.
According to Adams (1959) roughly only about 30% of the publications
considered as major contributions to the field by Austrian and German
psychologists are abstracted.

Who edits the Abstracts can make a difference and the editors have
changed over the years. W. S. Hunter was the editor up to 1946. He was
followed by C. M. Louttit, who was followed by Allen J. Sprow.1 The
present editor, Horace B. English, is first mentioned in the February
1959 issue.

Difficulties in topic arrangement and changes in the number of captions
and categories used can make a difference. Such changes have taken place
in the history of the Abstracts. Categories have changed. “Industrial
and Personnel Problems” was used as one inclusive category for all
articles in the field from the initial issue until August 1947. In
September 1947, two captions appeared: “Personnel Psychology” and
“Industrial and Business Psychology.” In October 1947, “Vocational
Guidance” was included as a subcaption of “Personnel Psychology” so
there were really three categories. In January 1948, the categories were
again changed with “Personnel Psychology” including articles on
“Selection and Placement” and “Labor-Management Relations” and with
“Industrial and Other Applications” including as subcaptions “Industry,”
“Business and Commerce,” and “Professions.” The categories have not been
changed since that time.

1 In January 1947, Louttit took over as Editor. Sprow was listed as
Assistant Editor in the January 1949 issue. His name reappeared in May
1953 in the same capacity. Sprow was listed as Executive Editor from
February 1957 to February 1959, when English was listed as Editor.


ARE MORE CONTRIBUTORS PRODUCING LESS?


What is the picture of productivity when considered in terms of output
in relationship to number of potential contributors? If we consider
Fellows and Members in Division 14 as potential contributors, we can
compare the gains made in membership to amount of productivity. In 1948,
Division 14 had 108 Fellows and 78 Associates. In 1958 the Division had
250 Fellows and 409 Members, which represents a total increase of 354%.
In spite of this increase in membership and individual potential
contributors of articles that would be listed in the Abstracts, the
expected increases do not appear. In 1948 the approximate output per
member was 2.45. In 1958 it was only .60. In the most productive year,
1955, with 213 Fellows and 301 Associates, the output per member was
approximately 1.51. In the last 10 years membership has increased almost
four-fold while the output for even the most productive year has
increased less than twofold.
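
The arithmetic behind these comparisons can be checked with a short sketch. The membership figures and per-member rates are those quoted above; the article counts are not quoted in this passage, so they are backed out here as rate times membership, which is an assumption about how the rates were formed. On these figures the 1958 membership is about 3.54 times the 1948 membership, which appears to be what the 354% figure expresses.

```python
# Division 14 membership as quoted in the text.
members = {
    1948: 108 + 78,    # Fellows + Associates
    1955: 213 + 301,   # Fellows + Associates
    1958: 250 + 409,   # Fellows + Members
}

# Approximate output per member as quoted in the text.
output_per_member = {1948: 2.45, 1955: 1.51, 1958: 0.60}

# 1958 membership relative to 1948.
membership_ratio = members[1958] / members[1948]

# Implied article counts (assumption: rate = articles / members).
implied_articles = {year: output_per_member[year] * members[year]
                    for year in members}

# Growth of total output from 1948 to the most productive year, 1955.
output_ratio = implied_articles[1955] / implied_articles[1948]
```

The implied totals rise by a factor of roughly 1.7 between 1948 and 1955, consistent with the statement that output increased less than twofold while membership grew almost four-fold.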

One thing is known about the increased number of people interested in
the field of industrial psychology. Many more of them are now identified
with business or industrial organizations, or consulting organizations,
than ever before. It is probably a fact, as at least two of the nine
contributors to the industrial psychology chapters in the Annual Review
indicate, that it is not the psychologists in industry who are doing the
publishing. It is the academic people who feel compelled to write or who
can afford the luxury of time, if not money; that is, the occupational
selectivity in recent years has changed where the productive energy is
being used for purposes other than publications in psychological
journals by an ever larger number of people who belong to Division 14.
This appears to be one of the chief reasons for the decline in
individual productivity. Another may be that more of the members are
spending their writing energies on reports for executives or for
industrial organizations. Also, some of the newcomers in the field may
be publishing in journals that are interdisciplinary in nature. There
are many more reasons which may deserve investigation, but one thing is
certain from a study of the data reported in Table 1: the actual tempo
of growth in total output and decrease in average individual output are
contradictory to rather than in accord with the exaggerated impressions
of vast increases assumed.


SUMMARY AND IMPLICATIONS 


Because industrial psychology seems to be a rapidly growing field, it is
assumed that it has grown in productivity as well as in increased
numbers expressing interest in the field by way of belonging to Division
14 of the APA. A survey of the articles appearing in Psychological
Abstracts from 1927 to 1959 reveals this assumption to be false. What is
shown may be summarized as follows: (a) There has been some growth over
the years and new plateaus have been reached. When viewed by the year,
the gains made are not continuous or sustained. When comparisons made
are considered in three-year spans, the same conclusion is evidenced,
except that from 1942 to 1956 there are continuous gains, which dropped
in 1957 and 1958, markedly more so in 1958. (b) There have been some
peak years and also some lean years. Which years are the years of large
gains and which are the lean years is reported in the study and some of
the reasons that might explain the ups and downs are suggested. (c)
Considering the immense increase in the number of individuals available
for contributing, the ratio per individual has become substantially
smaller. While membership has increased almost four-fold, individual
output in the most productive of recent years, 1955, is less than
two-fold. This decrease in individual productivity is worth considering.

What is the meaning and what are the im- 
plications of the foregoing facts? The impres- 
sion of businessmen about the growth of in- 
dustrial psychology is not based on any con- 
cepts of productivity in the field. Instead, the 
opinions they have are seemingly the conse- 
quence of their personal experiences with psy- 
chologists and people who come to their offices 
to sell psychology. 

As far as psychologists who are assuming 
that there is increased productivity are con- 
cerned, the statements are not based on facts 
but on the general impression that with in- 
creased size there must necessarily be in- 
creased productivity. Assuming that the pro- 
ductivity of more relevant and significant, as 
well as comprehensive studies is desired, what 
can be done about it? On the surface, some 
of the suggestions that come to mind in light 
of the findings reported in this study may be 
phrased this way: Get the academic people 
to write less; get the people in industrial set- 
tings who are in a position to have a better 
sense of relevance about human problems in 
industry, as well as organization problems, to 
write more; get the consulting organization 
staffs to be interested in research as well as in 
business for money only. All of these are sug- 
gestions that come to mind. They are not 
likely to occur unless there is some stimula- 
tion or guidance or influence from a profes- 
sional group, which includes people from the 
three areas: psychologists in industry, psychologists
in consulting organizations, and
professors of industrial psychology in the uni- 
versities. Increasing fluidity of the boundary 
lines which exist at present between these 
groups can, with the encouragement of a 
widening viewpoint, change the direction as 
well as the extent of productiveness in the 
field. From participants in such work and 
study efforts can be expected a concern with 
a search for fruitful knowledge to apply as 
well as the application of available knowledge. 
A trend toward basic research in industry can 
come about as indicated in a recent article 
(Fisher, 1959) in Science by participation of 
enough people in the three groups whose basic 
motivation is science as well as economics. 

QUANTITATIVE MOTIVATIONAL DIFFERENCES BETWEEN
VOLUNTEERS AND NONVOLUNTEERS FOR A
PSYCHOLOGICAL EXPERIMENT


EDMUND S. HOWE

Department of Psychiatry, University of Maryland School of Medicine

Several recent papers have attempted to
demonstrate differential psychological characteristics
of volunteers versus nonvolunteers for
psychological experiments. Attention has most
naturally and commonly been given to anxiety,
which one would spontaneously expect to
play a demonstrably significant role in the
volunteer act. But anxiety has indeed shown
surprisingly neutral and equivocal predictive
power.

In a study requesting some Ss to volunteer
for “a personality experiment,” and other Ss
voluntarily to take the MMPI, Rosen rather
generally concluded that “volunteers showed
a greater tendency than nonvolunteers to admission
of discouragements, anxieties, and inadequacies
and some tendency toward defensiveness”
(1951, p. 192). Siegman (1956)
found no difference on the Taylor
MAS, among other personality measures, between
volunteers and nonvolunteers for a
Kinsey type of interview, although he did
find isolated attitudinal differences. Himelstein
(1956) also found no difference in MAS
scores between volunteers and nonvolunteers
for an experiment on “personality,” although
the nonvolunteers showed a numerically
slightly higher score. He concluded that
“level of anxiety, to the extent that this factor
is measured by the MAS, is not a major
variable in the act of volunteering” (1956,
p. 136). Following a request for volunteers
for any one of four experiments (“Learning,”
“Personality,” “Attitude to Sex,” and “Hypnosis”)
Martin and Marcuse (1958) found
no differences on the MAS (among other
measures) between volunteers and nonvolunteers;
but the authors state that the “anxiety
score, however, was suggestive [of significance]
in the experiment dealing with personality”
(1958, p. 477). Finally, Maslow
and Sakoda (1952) found that volunteers for
a Kinsey study were higher in self-esteem
than nonvolunteers; and they also suggested
that Ss with extreme scores were volunteers,
rather than nonvolunteers. It is tempting to
conclude from the results cited, as did Himelstein
(1956) in his study, that anxiety, at
least as it is assessed by the Taylor MAS, is
not an important variable in the volunteer
act. The following argument is presented as
an alternative to this conclusion.

Whether anxiety plays a role in volunteer- 
ing behavior will depend, at least in part, 
upon the nature of the stimulus—the “Re- 
quest for Volunteers.” If the “experiment” in 
which S is invited to participate be poten- 
tially but clearly ego-threatening, or symboli- 
cally injurious or painful, then one would in- 
deed be on very safe ground in doggedly in- 
sisting that an adequate anxiety measure must 
show that more low- than high-anxious Ss 
volunteer. If, on the other hand, the “experi- 
ment” is clearly completely innocuous, then 
one would stick less doggedly to the anxiety 
hypothesis in the face of negative results, and 
perhaps even abandon it.¹ In the conventional
Request for Volunteers, however, the
“experiment” has been simply labeled as concerning
“Learning,” or “Personality,” and so
forth. Although such labels sound innocuous
(and indeed they probably are), there is no
way of knowing what type of motive (approach
or avoidance) they arouse in S. Consequently,
the resultant measure of relationship
obtained between personality characteristics
and the volunteering act has been potentially
due to two confounded variables: (a) whether
the “experimental topic” as described arouses
a threat, and (b) whether predispositional
characteristics such as anxiety may affect responsiveness
to this threat. For purposes of
experimental scrutiny the confounding effect
may be partly controlled by making the motivational
appeals in the Request highly specific
and explicit, and further controlled by
treating “strength of threat” as an independent
variable. The study reported here contains
such controls.

1 Under conditions where the volunteering act
promised to be a source of social approval or instead
a means of avoiding an undesirable, anxiety-evoking
alternative (e.g., Blake, Berkowitz, Bellamy,
& Mouton, 1956) one might conceivably draw the
opposite anxiety hypothesis: that more high- than
low-anxious Ss would volunteer. In the interests of
space and clarity, however, the argument presented
above is restricted to contexts in which anxiety is
more likely to lead to avoidance than to approach.

The experiment attempted to involve two 
rather specific motivations through the use of 
clearly defined cues in the Request for Volun- 
teers. One motivation was the need for cash 
(an approach motive); the other, fear of, or 
anxiety about electric shock (an avoidance 
motive). It was anticipated: (a) that were 
it possible to arouse differential strengths of 
threat via the Request, then more Ss would 
volunteer under a Weak Threat condition, 
compared with a Strong Threat condition; 
and (b) that volunteers would show relatively
strong approach and relatively weak avoid- 
ance, in comparison with nonvolunteers. In- 
dependent though crude measures of the 
strength of approach and of avoidance were 
obtained. The relationship of the avoidance 
measure to anxiety was assayed with three 
anxiety scales; two of which were conven- 
tional, the third having been specially con- 
structed and incorporated because of its likely 
relevance to the specific avoidance motive in- 
volved. 


METHOD 


Subjects. Two introductory psychology classes each
of 89 students at Brooklyn College,² meeting at consecutive
hours on the same morning, were used as Ss.

Materials and Procedure. Three weeks before the
experiment proper all Ss completed the 20-item short
form of the Taylor (1953) Manifest Anxiety Scale
(here referred to as the SMAS) previously shown to
give results comparable to the 50-item form (Bendig,
1956); the Christie and Budnitzky (1957) Short
Forced-Choice Anxiety Scale (here referred to as the
SFCAS); and a 20-item n Harmavoidance Scale.³
The n Harmavoidance Scale consisted of 9 of the
original 10 items employed by Murray (1938, p. 199)
to which the author added another 11 items. Murray
equated n Harmavoidance with the tendency to
fear and avoid stimuli symbolizing pain, injury, disfigurement,
illness or death; and it was thus anticipated
that such a scale may be more likely than the
SMAS or the SFCAS to discriminate measures of
anxiety about electric shock (i.e., the tendency to
avoid the “experiment”). The three questionnaires
were distributed by the class instructor, and were
filled out under surveillance during a single class
period.⁴

On the day of the experiment E was introduced at
the beginning of the hour. He said simply that he
was doing some research on the Brooklyn campus,
and that he was about to give the students a chance
to earn some extra cash. A mimeographed Request
for Volunteers was then distributed. Although the
Request cannot be reproduced here, it may be assumed
to have appeared highly credible to at least
95% of Ss.

The Request made reference to research work under
way at Brooklyn, describing it as a study of
“human visual and muscular skills . . . under various
conditions of disruption and interference, something
like the ‘kind of thing you must have played
with in an amusement park.’” It asserted that volunteers
would receive $3.00 in cash for their services;
and that the experiment would involve an electric
shock. For one class the shock was described as extremely
weak; for the other class, as moderate-to-strong.
Both groups of Ss were informed in the Request
that the “experiment” would be administered
during the class period; and that at the proper time
the research assistants would call for volunteers in
a group. It was also stated that the instructor would
permit such an arrangement without loss of credit or
attendance. All Ss were then asked to check whether
they would prefer to attend immediately (i.e., in one
half-hour), or to postpone their attendance for 7,
14, or 21 days.

At the back of the Request, ostensibly unrelated
to it, was appended a “Student Opinion Questionnaire.”
The preface to the questionnaire stated that
“There is an ever-increasing amount of experimentation
going on these days involving shock, stress, etc.,
and the problem of getting enough students sufficiently
motivated to serve as paid experimental subjects
is a most important one. It would be of considerable
help to us in future requests for subjects
if, while we have your attention, you would kindly
let us have your opinions about the following questions.
Your opinions will be most valuable. Please
give us your opinions whether you volunteered for
the experiment or not.” The questionnaire contained,
inter alia, three critical items each measured on a
five-point scale, dealing with S’s “anxiety about electric
shock,” “fear of pain, sting, or burn from electric
shock,” and “fear of injury from electric shock.”
The sum of S’s scores of these items shall be referred
to as “n Shockavoidance” and assumed to be
a rather coarse measure of S’s aversive feeling about,
and hence his tendency to avoid electric shock. Three
other critical five-point scale items likewise dealt
with S’s present and future needs for cash. The sum
of these scores was used as a coarse measure which
shall be called “n Cash,” assumed to be partly related
to S’s tendency to approach the experimental
situation. After the forms had been collected the real
nature of the experiment was described to the class
by E. A show of hands indicated that few Ss had
failed to take the entire procedure perfectly seriously.

2 Sincere appreciation is acknowledged to Evelyn
Raskin, who kindly permitted her students to complete
both phases of this research.

3 A copy of this scale may be obtained upon request.

4 The statistical characteristics of these scales as
well as the finer details of their administration are
presented elsewhere (Howe & Silverstein, 1959).
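
The two questionnaire measures described above are simple sums over three five-point items. As a minimal sketch, assuming each item is coded from 1 to 5 (the source does not give the numeric coding), the scoring could look like this. The function name and the example ratings are hypothetical.

```python
from typing import Sequence


def need_score(item_ratings: Sequence[int]) -> int:
    """Sum three five-point item ratings into one coarse need score.

    Assumes each item is coded 1..5; the coding is an assumption,
    since the source only says the items were five-point scales.
    """
    if len(item_ratings) != 3:
        raise ValueError("expected exactly three critical items")
    if any(r < 1 or r > 5 for r in item_ratings):
        raise ValueError("each rating must lie on the 1-5 scale")
    return sum(item_ratings)


# Hypothetical respondent: ratings on the three shock items and the
# three cash items (made-up values, for illustration only).
shock_items = (4, 5, 3)
cash_items = (2, 1, 2)

n_shockavoidance = need_score(shock_items)
n_cash = need_score(cash_items)
```

Under this coding each score ranges from 3 to 15, which is one reason the text calls both measures coarse.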


RESULTS 

Hypothesis 1 predicted that were the two 
shock conditions differentially threatening, 
then more Ss would volunteer from the Weak 
Shock compared with the Strong Shock group. 
A total of 69/89 Ss volunteered from the 
Weak Shock group, while 60/89 Ss did so in 
the Strong Shock group. The difference is in 
the expected direction, but not significantly 
so (χ² = 2.281; with df = 1, .10 < P < .20).
It must therefore be assumed that the two 
Shock conditions did not arouse differential 
degrees of threat. 

Combining Weak and Strong Shock groups,
therefore, and breaking down Ss into two
groups on the basis of sex, it was established
that 70/105 female Ss volunteered, while
59/73 male Ss did so. The larger ratio of
male volunteers is significant at a conventional
level (χ² = 4.331; with df = 1, P < .05).
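
The two comparisons of volunteering rates can be checked from the frequencies given in the text. The sketch below assumes the statistic is the ordinary Pearson chi-square for a 2 x 2 table without continuity correction; the source does not say which formula was used. The function name is illustrative, and the critical values are the standard tabled ones for df = 1.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Rows are groups; columns are volunteers and nonvolunteers.
weak_vs_strong = chi_square_2x2(69, 89 - 69, 60, 89 - 60)
female_vs_male = chi_square_2x2(70, 105 - 70, 59, 73 - 59)

# Tabled critical values of chi-square with df = 1.
CRITICAL_05 = 3.841
CRITICAL_10 = 2.706
CRITICAL_20 = 1.642

print(round(weak_vs_strong, 3))
print(round(female_vs_male, 3))
```

Under this assumption the Weak versus Strong comparison gives 2.281, which lies between the .20 and .10 critical values, and the sex comparison gives about 4.32, which exceeds the .05 critical value and is close to the reported figure.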

Through carelessness or otherwise, however, 
23 volunteers and seven nonvolunteers did not 
complete all three anxiety scales in an ac- 
ceptable manner. After ascertaining that noth- 
ing would be statistically lost or biased by 
omission of these Ss from all subsequent com- 
putation, they were removed from their re- 
spective samples. 

All subsequent statistical treatments were 
thus based upon 49 volunteers and 15 non- 
volunteers in the Weak Shock group; and 
upon 57 volunteers and 27 nonvolunteers in 
the Strong Shock group. Scores were computed
for each S on the single approach variable
(n Cash); on the two avoidance variables
(n Shockavoidance and n Harmavoidance);
and on the SMAS and the SFCAS. Since
“anxiety” is explicitly hypothesized in this
paper to serve as a source of avoidance, the
last four variables will, however, be collectively
referred to as “avoidance variables.”

A comparison was first made of Weak 
Shock versus Strong Shock Ss on the single 
approach variable and on the four avoidance 
variables. A Weak-Strong difference occurred 
only on the n Shockavoidance measure (t =
3.111; with df = 146, P < .01),⁵ the Strong
Shock Ss showing higher scores. This difference
holds up for volunteer Ss alone (t =
2.654; with df = 104, P < .01), considered
separately from nonvolunteers (t = 1.417;
with df = 40, P > .05). The discrepancy
between these results, and the finding (reported
earlier) that the Weak-Strong difference in 
proportions of volunteers is not significant, 
suggests that the n Shockavoidance measure 
is not necessarily completely independent of 
the stimulus condition under which a response 
is made to it. In presenting results concerned 
with the n Shockavoidance variable, there- 
fore, a distinction will be maintained between 
Weak and Strong Shock groups (see Table 1). 
This distinction will not be made for data 
concerning the other four motivational vari- 
ables. 

A comparison was next made of Male versus 
Female differences on all five motivational 
variables. There were virtually zero sex differences
on the SMAS and SFCAS scores. Female
Ss showed higher scores, however, on the
n Shockavoidance variable (t = 2.450; with
df = 146, P < .02) and on the n Harmavoidance
variable (t = 4.366; with df = 146, P
< .001). Male Ss, on the other hand, scored
higher than female Ss on the n Cash variable
(t = 2.664; with df = 146, P < .01). Presentation
of data concerning these three “need”
variables will therefore maintain a distinction 
between male and female Ss. This is especially 
desirable since there is a borderline sex differ- 
ence in frequency of the volunteering act. 
This distinction will not be observed for
SMAS and SFCAS data.

Table 1 presents data⁶ relevant to Hypothesis
2. This hypothesis, it will be recalled,
predicted volunteers to show stronger
approach and weaker avoidance, in comparison
with nonvolunteers. With regard first to
the approach variable (n Cash) it may be observed
that in accordance with the hypothesis,
this source of motivation differentiates volunteers
from nonvolunteers. Among the avoidance
motivational variables only n Harmavoidance
and n Shockavoidance significantly
discriminate between the two types of Ss.
Lower n Harmavoidance scores occur, as hypothesized,
among the volunteers. This difference
holds up when females are considered
separately; but when males are so treated,
the difference falls short of two-tailed significance.
Lower n Shockavoidance scores also
occur among all volunteer versus all nonvolunteer
Ss. The difference holds up for the Strong
Shock group, but not for the Weak Shock
group. The difference further holds up for
male volunteers, but falls short of two-tailed
significance for female volunteers.

TABLE 1
[...] OF VOLUNTEERS AND NONVOLUNTEERS ON ONE APPROACH VARIABLE AND FOUR AVOIDANCE VARIABLES
[Table body not recoverable from the source. Surviving column headings: Motivational variable, Subject group, Mean. Surviving row labels: n Cash (Male, Female, Combined); n Harmavoidance (Male, Female, Combined); n Shockavoidance (Male, Female, Weak shock, Strong shock, Combined); two further rows labeled Combined. Surviving means, cell positions unknown: 10.28, 9.12, 9.66, 8.18, 10.79.]

TABLE 2
INTERCORRELATIONS AMONG THE FOUR MEASURES OF AVOIDANCE (N = 148)
[Table body not recoverable from the source. Surviving labels: SFCAS, SMAS, n Shockavoidance, n Harmavoidance. Surviving entries, cell positions unknown: .18*, 146.]

5 All values of P presented in this paper are conservatively
based upon two-tailed tests.

6 It can be shown that the values of t cited in
Table 1 tend to be slightly in excess of those based
on the point-biserial correlation coefficient. If N
[...] and if the variance for volunteers and nonvolunteers
do not differ, the discrepancy between the
two values of t will be very small.
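
The footnote to Table 1 relates the reported t values to the point-biserial correlation. As a hedged illustration, the sketch below uses the standard identity that holds for a pooled-variance two-group t test, r = t / sqrt(t² + df). Whether the t values in the source were computed with pooled variance is not stated, so this is an assumption. The example value is the sex difference on n Harmavoidance quoted earlier in the Results; the function names are illustrative.

```python
import math


def t_to_point_biserial(t: float, df: int) -> float:
    """Point-biserial r implied by a pooled-variance two-group t."""
    return t / math.sqrt(t * t + df)


def point_biserial_to_t(r: float, df: int) -> float:
    """Inverse conversion: t implied by a point-biserial r."""
    return r * math.sqrt(df / (1.0 - r * r))


# Sex difference on n Harmavoidance reported in the text: t = 4.366, df = 146.
r_pb = t_to_point_biserial(4.366, 146)
t_back = point_biserial_to_t(r_pb, 146)

print(round(r_pb, 3))
```

Under the pooled-variance assumption the two statistics carry the same information, which fits the footnote's remark that the discrepancy is small when the group variances do not differ.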

The foregoing results imply that volunteers 
tend to be low, rather than high in avoidance 
motivation. On the other hand, the very small 
values of t for volunteers versus nonvolunteers
on both the SMAS and the SFCAS leave
no doubt whatsoever that “manifest anxiety,” 
as it is assessed by these two scales, played 
no role in the volunteering act. 

While the n Harmavoidance and n Shockavoidance
Scales are the only two out of four
to show the predicted discrimination among
Ss, these scales are not, however, unrelated to
the SMAS and the SFCAS. Table 2 shows the
intercorrelations among the four scales. It
may be seen that small, but nevertheless just
significant correlations occur between the
SFCAS, and both n Shockavoidance and n
Harmavoidance. Furthermore, the SMAS
shows a significant relationship to n Shockavoidance.
These findings suggest that in the
situation studied, “anxiety” may have played
a role in the volunteering act, but that such
“anxiety” is relatively bound, and perhaps
aroused only by rather specific contexts.


DISCUSSION 


Three other studies had previously failed
to show any difference on SMAS or MAS
scores between volunteers and nonvolunteers.
The present study likewise failed to show
such a difference, even though the Request
for Volunteers contained at least one explicitly
defined cue which one would have expected to
evoke avoidance tendencies (i.e., nonvolunteering)
in high-anxious Ss. The overall negative
findings among the four studies constitute
a sufficiently consistent trend to call for
a descriptive hypothesis.

It should be borne in mind that as Taylor
(1956) has hypothesized, and as Mednick
(1957) has recently affirmed, “a high MAS
score predicts that anxiety may be [italics
added] elicited by ‘stress situations’ and is
not a chronic state which would manifest itself
in any circumstance” (1958, p. 493).
While the process of an experiment (involving,
say, electric shock) for which S is requested
to volunteer may well constitute a
“stress situation,” the act of volunteering
per se clearly does not do so. To the extent
that the volunteering act is in fact but a
preparatory, commitment response toward an
event not yet present, such a response will
therefore involve anticipatory mediating reactions
which are relatively weak compared with
those that arise in the face of the event itself.
The failure in the present context of the
SMAS (and the SFCAS) to differentiate
volunteer Ss from other Ss may thus be deduced
from the proposition that “reactivity
at the level of commitment-making behavior
to threat of electric shock bears little relationship
to predispositional manifest anxiety
as it is presently assayed.” Such an hypothesis
is necessary on the one hand because of
the negative findings reported above; and on
the other hand because of relationships repeatedly
demonstrated in “live” conditioning
situations, between manifest anxiety and
response to strength of threat of electric
shock (or other aversive stimulus).

This raises the question why the n Harmavoidance
Scale in particular predicts the
volunteering act when the conventional scale
has failed. The findings of this experiment
imply that the n Harmavoidance Scale taps
S’s tendency to anticipate potential danger or
harm in situations yet to come. Indeed, Howe
(1959) has elsewhere shown that among the
“volunteers” for the “experiment,” Ss preferring
to postpone their attendance, rather than
attend immediately, are more “anxious” (P <
.02) as judged by n Harmavoidance measures,
but no different (P = .50) as judged by
the SMAS measures. While these data are not
offered as acceptable validity data they do
suggest that, beyond the volunteering situation
as such, the n Harmavoidance Scale may
indeed be sampling some avoidance motivational
component untouched by the (S)MAS.

This line of argument still leaves open the
possibility that the stimuli contained in the
Requests for Volunteers aroused a rather specific,
bound anxiety which, also at the level
of commitment making, bears but a low relationship
to general manifest anxiety. The findings
reported here are consistent also with this
possibility. It remains to be seen whether
comparable studies with other contexts yield
similar results.


SUMMARY 

Two classes each of 89 students were given 
the short form of the Taylor MAS (the 
SMAS), the Christie and Budnitzky Short 
Forced-Choice Anxiety Scale (the SFCAS), 
and a 20-item scale purporting to assess Mur- 
ray’s n Harmavoidance. Two weeks later all 
Ss were invited, in a highly credible printed 
Request, to commit themselves to participate 
on any one out of the four subsequent days, 
in an “experiment.” It was asserted that S 
would be paid $3.00 for his cooperation. Fur- 
ther, one class was informed that the “experi- 
ment” would involve an extremely weak elec- 
tric shock, and the other class a moderate- 
to-strong electric shock. Measures were then 
obtained from each S of the strength of his 
immediate need for extra cash (n Cash), and 
of his fears about electric shock (n Shock- 
avoidance). 
It was argued that explicit threat of electric
shock would arouse anxiety, and that the
stronger the threat, the more likely would it
motivate avoidance (i.e., nonvolunteering).
Consequently, it was predicted: (a) that
more Ss would volunteer from the Weak
Shock group, than from the Strong Shock
group; and (b) that volunteers would show
stronger approach and weaker avoidance than
nonvolunteers.

More Ss volunteered from the Weak Shock
group, but not significantly so; and there were
significantly more males than females. Volunteers
were compared with nonvolunteers on all
five variables mentioned. Volunteers showed
significantly higher n Cash; significantly
lower n Shockavoidance; and significantly
lower n Harmavoidance. As in other reported
studies of this kind, the SMAS did not at all
discriminate volunteer from other Ss; nor in
fact did the SFCAS.

Relationships among avoidance variables, a
methodological issue, and the repeated negative
findings with regard to the (S)MAS were
discussed.

INDUSTRIAL SALESMEN AND RETAIL SALESMEN

Minnesota Mining and Manufacturing Company

Many standard correlational studies have
been conducted in an effort to measure characteristics
associated with selling effectiveness.
Typically, these studies have used ratings or
rankings as criteria of selling effectiveness and
various psychological test scores have been
correlated with the ratings. Implicit in many
of these studies has been the assumption that
the same pattern of personal characteristics
may be associated with top-notch selling regardless
of the products sold, the kind of people
called on, or other differentiating characteristics.
The rather negative results obtained
in many such studies stem in part, certainly,
from the fact that the foregoing assumption is
probably not valid. In other words, many investigators
have run afoul of the easy tendency
to compare persons with one another
who may be assigned to essentially different
jobs involving different duties and responsibilities.
As a result, the criterion ratings have
suffered and often have been less valid estimates
of selling effectiveness than they might
have been. Also, the expectation that the same
or similar traits might predict effectiveness in
all selling jobs has not been realized; contamination
has been introduced into not only
the criterion but also the predictor.

A recent study by Witkin (1956) points up
differences in the vocational interest patterns
of salesmen engaged in different kinds of selling.
Statistically significant differences were
obtained on several scales of the Strong Vocational
Interest Blank (SVIB) by Specialty
Salesmen, Route Salesmen, and Sales Engineers.
The results of this study combined with
the considerations discussed above led us to
recognize the need for more careful job analyses
of different selling jobs and the need to
validate psychological measures separately for
different sales groups.

This approach is particularly necessary for
Minnesota Mining and Manufacturing Company
(3M) where the diversity of products
makes necessary the employment of at least
19 separate sales forces, each responsible for
certain products and for calling on certain
kinds of customers. This article reports differences
obtained on various psychological tests
by two major groups of 3M salesmen and reports
also relations between these tests and
managers’ estimates of the relative effectiveness
of salesmen in each of the two groups.


METHOD 


A Sales Job Description Checklist (SJDC)
was developed by Dunnette and Kirchner
(1959) to be used as an objective guide for
analyzing and describing different selling jobs.
The checklist consists of 35 sales activities
which are ranked by a respondent in order of
their overall importance to successful performance
of his job. The checklist can be
scored for three major factors which have
been identified by cluster analysis. These factors
are Retail Selling, Industrial Selling, and
Selling in General. The Retail Sales cluster is
defined by activities such as: canvassing store-to-store,
calling directly on retail dealers, engaging
in promotional work such as advertising,
and checking on movement of customers’
stocks and replenishing when necessary and
appropriate. The Industrial Sales cluster is
defined by activities such as: calling directly
on industrial firms, calling directly on professional
and technical persons, giving technical
or scientific advice to customers concerning
use of your product, and working with customers
on special problems concerning product
uses.

Responses of 3M salesmen to the SJDC
were used to identify a 3M Retail sales group
and a 3M Industrial sales group. Both groups
included men who had been in their present
jobs for five years or more, thus assuring a 
relative degree of success and satisfaction and 
certainly persistence in their jobs. Salesmen 
were assigned to the Retail group if they 
scored in the upper one-third on the Retail 
scoring key of the SJDC and at least 30 per- 
centile points higher on the Retail than on the 
Industrial key of the SJDC. In like manner, 
salesmen were assigned to the Industrial group 
if they scored in the upper one-third on the 
Industrial scoring key of the SJDC and at 
least 30 percentile points higher on the In- 
dustrial than on the Retail key of the SJDC. 
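
The assignment rule just described can be written as a small decision function. This is only a sketch: it assumes each salesman's two SJDC key scores are already expressed as percentiles from 0 to 100, and that "upper one-third" means at or above the 66.7th percentile. The function name, the constants, and the handling of ties at the cutoffs are illustrative choices, not details taken from the source.

```python
from typing import Optional

UPPER_THIRD_CUTOFF = 100 * 2 / 3  # assumed reading of "upper one-third"
MIN_GAP = 30.0                    # required percentile-point difference


def assign_sales_group(retail_pct: float, industrial_pct: float) -> Optional[str]:
    """Return "Retail", "Industrial", or None when neither rule is met."""
    if retail_pct >= UPPER_THIRD_CUTOFF and retail_pct - industrial_pct >= MIN_GAP:
        return "Retail"
    if industrial_pct >= UPPER_THIRD_CUTOFF and industrial_pct - retail_pct >= MIN_GAP:
        return "Industrial"
    return None


# Hypothetical percentile pairs (retail key, industrial key).
examples = [(90, 50), (40, 85), (90, 70), (60, 20)]
groups = [assign_sales_group(r, i) for r, i in examples]
```

The two conditions cannot both hold for the same salesman, since each requires the opposite 30-point gap, so the order of the checks does not affect the result. Salesmen who meet neither condition are left out of both groups.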

These techniques resulted in identification 
of 50 Retail salesmen and 70 Industrial sales- 
men. Previously, all 3M salesmen had been 
asked to participate in a research study di- 
rected at validating psychological tests against 
3M selling effectiveness. Participation involved 
taking a variety of psychological tests includ- 
ing Part I—Verbal of the Wesman Personnel 
Classification Test, the Strong Vocational In- 
terest Blank, and the Edwards Personal Pref- 
erence Schedule. In addition, each salesman 
completed an Adjective Checklist (Kirchner 
& Dunnette, 1958, 1959) developed by Dun- 
nette. Sales managers throughout the com- 
pany were asked to rank salesmen reporting 
to them with respect to overall selling effec- 
tiveness. These rankings were converted to 
stanine scores and used as estimates of each 
salesman’s job effectiveness. In six instances, 
rankings were obtained from two managers 
independently rating the same group of men. 
The coefficients of correlation between the sets 
of rankings ranged from .50 to .80 with a 
median value of .74. 
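
The source does not say how the managers' rankings were converted to stanines. One common approach, shown here purely as an assumed sketch, is to turn each rank into a mid-rank percentile and then apply the usual stanine bands of 4, 7, 12, 17, 20, 17, 12, 7, and 4 percent. The function name and the mid-rank convention are illustrative.

```python
from bisect import bisect_right

# Cumulative proportions that close stanines 1 through 8
# (bands of 4, 7, 12, 17, 20, 17, 12, 7, 4 percent).
STANINE_CUTS = [0.04, 0.11, 0.23, 0.40, 0.60, 0.77, 0.89, 0.96]


def rank_to_stanine(rank: int, n: int) -> int:
    """Convert a rank (1 = most effective of n men) to a stanine from 1 to 9."""
    if n < 1 or not 1 <= rank <= n:
        raise ValueError("rank must lie between 1 and n")
    # Mid-rank percentile: proportion of the group ranked below this man.
    percentile = (n - rank + 0.5) / n
    return bisect_right(STANINE_CUTS, percentile) + 1


# Hypothetical manager ranking of 100 salesmen.
top = rank_to_stanine(1, 100)
middle = rank_to_stanine(50, 100)
bottom = rank_to_stanine(100, 100)
```

A conversion of this kind puts rankings from managers with different numbers of salesmen on one nine-point scale. With very small groups the extreme stanines cannot be reached.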

Psychological test information and Adjec- 
tive Checklist responses were compared for 
the two sales groups. Comparisons between 
test scores and Stanine Effectiveness Scores 
also were made separately for each of the two 
groups. 


RESULTS 


Wesman Personnel Classification Test. Mean
scores on Part I of the Wesman test for the
Retail and Industrial groups are 30.17 and
30.98, respectively. This difference is not significant
statistically. The mean scores do,
however, show both groups to possess superior
verbal reasoning ability (e.g., a score of 31 is
at the 80th percentile for Wesman’s norm
group of junior executives).

Managers’ rankings of effectiveness (Stanine
Scores) were available for 61 of the Retail
salesmen and 42 of the Industrial salesmen.
Verbal reasoning ability, as measured by the
Wesman test, is moderately related (r = +.35)²
to rankings of Industrial salesmen’s effectiveness
but independent (r = +.01) of
rankings of Retail salesmen’s effectiveness.
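
The significance of the two validity coefficients can be checked with the usual t transformation of a correlation, t = r * sqrt((N - 2) / (1 - r²)). The sketch below uses the group sizes stated in the text (42 and 61) and the standard tabled two-tailed .05 critical value of t for 40 degrees of freedom. The function name is illustrative.

```python
import math


def t_for_correlation(r: float, n: int) -> float:
    """t statistic for testing a Pearson r against zero, with df = n - 2."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Tabled two-tailed .05 critical value of t for df = 40.
CRITICAL_T_DF40_05 = 2.021

t_industrial = t_for_correlation(0.35, 42)  # Industrial group, N = 42
t_retail = t_for_correlation(0.01, 61)      # Retail group, N = 61

print(round(t_industrial, 3))
print(round(t_retail, 3))
```

For the Industrial group t is about 2.36, which exceeds 2.021. For the Retail group t is below 0.1, far short of any conventional level.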

Strong Vocational Interest Blank. Means, 
standard deviations, and ¢ values were com- 
puted for 3M Retail and Industrial salesmen 
on the 48 SVIB scales. The mean differences 
between the two groups are statistically sig- 
nificant on 18 scales. The scales on which In- 
dustrial salesmen score significantly higher are 
listed below: 


Psychologist (17 
Mathematician (5.5, 8.9)
Physicist (3.3, 7.2) 
Engineer (22.1, 27.0) 
Chemist (17.4, 21.1) 
Production Manager (36.1, 40.1) 
Aviator (34.2, 37.9) 
Mathematics, Physical Science Teacher 
(32.9, 35.8) 
Industrial Arts Teacher (14.9, 20.7) 
Public Administrator (38.6, 41.5) 
Senior CPA (35.1, 38.7) 
Masculinity-Femininity (48.4, 52.4) 
The scales on which Retail salesmen score 
significantly higher are listed below: 


Purchasing Agent (36.8, 34.4) 
Mortician (43.4, 40.1) 

Pharmacist (38.9, 35.4) 

Real Estate Salesman (49.9, 47.2) 
Sales Manager (50.7, 47.6) 

Life Insurance Salesman (50.2, 45.2) 

These results agree well with expectations. 
Industrial salesmen show somewhat greater in- 
terest in scientific and technical pursuits, in 
computational activities, and possess more 
masculine interests than their Retail counter-
parts. Retail salesmen show greater interest in
independent business pursuits and actual per-
suasive activities than Industrial salesmen. It
should be noted, of course, that the two groups
possess essentially similar vocational interests
directed primarily toward sales, business, and
personal contact occupations. The differences
discovered between the groups suggest simply
that Retail salesmen are somewhat more ex-
treme in their vocational likes and dislikes.
They show stronger, more marked patterns in
sales and persuasive areas and more distinct
rejection of scientific and problem solving
pursuits than do Industrial salesmen.

2 For N = 42, a correlation coefficient of .35 is sta-
tistically significant at the .05 level.

3 The numbers in parentheses are the mean SVIB
T scores obtained by Retail and Industrial salesmen,
respectively.

Managers’ rankings of effectiveness were
available for 61 of the Retail salesmen and
42 of the Industrial salesmen. Correlation co-
efficients were computed between SVIB scale
scores and rankings of effectiveness for each
of the two groups. For Industrial salesmen,
significant correlations were obtained on the
following scales:

Engineer +.28
Production Manager
YMCA Secretary .29
Social Science Teacher -.27
Banker -.26

For Retail salesmen, significant correlations
were obtained on the following scales:

Aviator .21
Printer .21
Math. and Phys. Science Teacher
Industrial Arts Teacher .26
Forest Service Man
Accountant -.22
Sales Manager +.26
Life Insurance Salesman

These results are not impressive in terms of
practical predictive utility; yet they do point
up the necessity for considering the groups
separately. The patterns of significant corre-
lations differ substantially between the two
groups, and the differences are in accord with
previously presented information concerning
differences between Retail and Industrial
salesmen. For example, significant positive
predictions with Industrial selling effective-
ness are shown by the Engineer and Produc-
tion Manager keys, whereas significant posi-
tive predictions for Retail selling effectiveness
are shown by the Sales Manager and Life In-
surance Salesman keys. It is apparent that
effectiveness in these two different 3M sales
jobs calls for distinctly different patterns of
vocational interest.

Edwards Personal Preference Schedule. 
Means, standard deviations, and t values were
computed for 3M Retail and Industrial sales- 
men on the 15 EPPS scales. Only two of the 
differences were statistically significant. Re- 
tail salesmen score higher on the Orderliness 
scale and lower on the Affiliation scale than 
Industrial salesmen. 

Correlations between the EPPS scales and 
managers’ rankings of effectiveness for Retail 
and Industrial salesmen also were computed. 
Only four of the correlations are statistically 
significant (three would have been expected 
to be significant at the 10% level simply by 
chance). It does appear, however, that Domi- 
nance may be related to overall effectiveness 
in both Retail and Industrial selling (r’s =
+.32 and +.29, respectively).
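
The chance expectation quoted in this paragraph is simple arithmetic. As a rough sketch, assuming the correlations counted are the 15 EPPS scales in each of the two groups (30 in all, an inference from the text rather than a stated figure) and, more loosely, that the tests are independent, the expected number of chance results and the probability of four or more can be computed as follows:

```python
from math import comb

N_SCALES = 15   # EPPS scales, as stated in the text
N_GROUPS = 2    # Retail and Industrial (assumed to give 30 correlations)
ALPHA = 0.10    # significance level used for the comparison


def expected_chance_hits(n_tests, alpha):
    """Expected number of 'significant' results when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least(k, n_tests, alpha):
    """P(X >= k) for X ~ Binomial(n_tests, alpha); assumes independent tests."""
    below = sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k)
    )
    return 1.0 - below


n_tests = N_SCALES * N_GROUPS
expected = expected_chance_hits(n_tests, ALPHA)
p_four_or_more = prob_at_least(4, n_tests, ALPHA)

print(f"correlations tested: {n_tests}")
print(f"expected by chance at the {ALPHA:.0%} level: {expected:.1f}")
print(f"P(4 or more by chance, if independent): {p_four_or_more:.2f}")
```

Under these assumptions the expected count is three, matching the figure in the text, and four or more chance results would turn up in roughly a third of such studies. The EPPS scales are not independent of one another, so the probability is only indicative.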

Adjective Checklist. The Adjective Check-
list consists of 36 groups of five adjectives.
In developing the checklist, adjectives were 
placed together which are approximately equal 
in social desirability. A respondent, in com- 
pleting the checklist, is asked to choose, from 
each group, the adjective he regards as most 
descriptive of himself and the adjective he 
regards as least descriptive of himself. Re- 
sponses made on the checklist by 3M Indus- 
trial and 3M Retail salesmen were compared 
and the differences tested for statistical sig- 
nificance. Results are shown in Table 1. 

It may be noted that Retail salesmen are 
more likely than Industrial salesmen to de- 
scribe themselves as rather intense, active, 
planful, painstaking, hard working, ambitious, 
impatient, nonscientific and noninventive. In- 
dustrial salesmen, on the other hand, are more 
likely than Retail salesmen to describe them- 
selves as fairly relaxed, rather shrewd and re- 
sourceful, patient and tolerant, yet wary and 
complicated, and possibly lacking in thorough- 
ness, organization, and even ambition. It ap- 
pears that the Industrial salesman perceives 
himself as depending more on his wits while 
the Retail salesman perceives himself as work- 
ing harder and organizing more skillfully to 
achieve success in selling.


Comparisons between item responses and
sales effectiveness ratings were not made sepa-
rately for the two groups. This is because
single-item responses are potentially much less
reliable than the scale scores which were used
in comparing the SVIB and EPPS measures
with ratings of effectiveness.

TABLE 1

ADJECTIVES THAT DIFFERENTIATE SIGNIFICANTLY BETWEEN 3M RETAIL
AND 3M INDUSTRIAL SALESMEN

More Often Picked by Retail Salesmen as Most Descriptive
More Often Picked by Industrial Salesmen as Most Descriptive
More Often Picked by Retail Salesmen as Least Descriptive
More Often Picked by Industrial Salesmen as Least Descriptive

Significant at 1% level
Significant at 5% level
Significant at 10% level

Hard Working

Deliberate
Changeable

Ambitious

Directive
Affectionate
Interests Wide

Resourceful

Mechanically Inclined
Patient
Tolerant
Foresighted
Cooperative
Reasonable
Wary
Complicated
Imaginative

Relaxed

Mechanically Inclined
Praising

Inventive
Ingenious
Patient
Tolerant
Opinionated

Scientific
Complimentary

Relaxed

Conservative
Thorough

Organized
Dignified
Active
Witty
Ambitious

Affectionate
Vindictive

Dignified
Painstaking
Cheerful
Quick
Honest
Careful

Dominant

DISCUSSION

Results reported here constitute, in a sense,
a psychological job analysis of 3M Retail sell-
ing and 3M Industrial selling. Comparisons
made between the two groups and between
test scores and ratings of selling effectiveness
pinpoint traits characteristic of and impor-
tant to success in these two kinds of selling.

For example, it is now possible to write a
thumbnail sketch of the typical successful In-
dustrial or Retail salesman. The Industrial
salesman places heavy emphasis on ingenuity,
inventiveness, and the exercise of his wits in
his job. This is shown not only by his self-
description (via the Adjective Checklist) but
also by his moderate degree of interest in sci-
entific, technical, and problem solving activi-
ties on the SVIB. Finally, of course, it is sug-
gested by the fact that success in his indus-
trial selling job is related to the level of his
verbal reasoning ability as measured by the
Wesman test.

The Retail salesman places heavy emphasis 
on planning, hard work, and persuading other 
people of his point of view or way of doing 
things. Again, this is shown not only by his 
self-description but also by his higher degree 
of orderliness on the EPPS, by his rather 
marked (and narrow) pattern of vocational 
interests in the selling and independent busi- 
ness occupations and by his striking rejection 
of such “thinking” jobs as those related to the 
technical or physical sciences. Success for the 
Retail salesman is predicted not by a measure 
of reasoning ability but rather by the level of 
his motivation toward selling (as measured by 
SVIB sales keys) and toward gaining a domi-
nant position in interpersonal relationships.

This analysis has aided us in understanding 
and identifying attributes of successful sales- 
men in different sales jobs, selling different 
products, and calling on various types of 3M 
customers. Our analysis was, however, limited 
to the cross sectional study of sales success 
and its correlates. Research now underway
will study sales success longitudinally in an
effort to describe more fully the contrasting
psychological test characteristics of Industrial
and Retail salesmen.

Perplexing to many a psychologist has been
the determination of the exact interrelation- 
ships among ability, interest, and aptitude 
(Strong, 1955). It is true that there exist 
formal definitions of each but, even so, we 
still have much to learn about the ways they 
interact and effect, or affect, behavior. Does 
a person, first, have ability and then develop 
interest? Or does he, first, have interest and 
then develop ability? Or does he, first, have 
aptitude, and then develop interest and/or 
ability?

Can ability be defined in terms of aptitude 
and interest? Or can we define interest in 
terms of aptitude and ability? Or, aptitude in 
terms of ability and interest? It is the very 
last of these interpretations which will be the 
burden of this article: that aptitude be de-
fined as a joint function of ability and in-
terest.

To demonstrate the logic which leads to 
this suggestion, frequent reference will be 
made to Fig. 1. This figure, constructed in 
accord with the convention that correlation 
can be represented geometrically by the cosine 
of an angle, shows a set of relations which 
grew out of and, when made explicit, became
a controlling guide in a selection research pro-
gram during a recent five-year period, 1954
to 1959.
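
The convention behind Fig. 1 can be checked numerically. The following is a minimal sketch, using made-up scores (the two lists are hypothetical, not data from the study), showing that the correlation between two variables equals the cosine of the angle between their mean-centered score vectors:

```python
import math


def centered(values):
    """Subtract the mean so each score is a deviation from its own average."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def pearson_r(x, y):
    """Product-moment correlation computed from sums of squares and cross-products."""
    dx, dy = centered(x), centered(y)
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def cosine_between(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def angle_degrees(r):
    """Angle whose cosine is r (r is clamped to the valid range first)."""
    return math.degrees(math.acos(max(-1.0, min(1.0, r))))


# Hypothetical predictor and criterion scores for six agents.
test_scores = [12.0, 15.0, 9.0, 20.0, 17.0, 11.0]
criterion = [3.0, 4.0, 2.0, 5.0, 3.0, 4.0]

r = pearson_r(test_scores, criterion)
cos_theta = cosine_between(centered(test_scores), centered(criterion))

print(f"r = {r:.3f}, cosine = {cos_theta:.3f}, angle = {angle_degrees(r):.1f} degrees")
```

On this reading, parallel vectors (angle near zero) stand for correlations near unity, and perpendicular vectors stand for a correlation of zero.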


Concentrate, now, on Fig. 1. Note first
its two principal dimensions: termination/sur-
vival (the x axis); and production (the y
axis). The latter has two subdivisions, high
and low; the former has four, survival for less
than one year (zero), or for one, two, or three
years, respectively.

Second, note on the left, spread along the
y axis, the words “personal history.” From
such we are going to show that production
(of debit life insurance salesmen)² can be
predicted.

Third, note on the bottom, spread along
the x axis, the word “interest.” From such we
are going to show that termination/survival
can be predicted.

This paper is a slightly revised version of one pre-
sented during a Research Planning Conference spon-
sored by the Life Insurance Agency Management As-
sociation, and held in Hartford, Connecticut, June
1-2, 1959.


Fourth, note (diagonally across part of the 
figure) the letters AI (for Aptitude Index). 
In terms of this vector we shall explain the 
ways in which validation of the Aptitude In- 
dex (a test for the selection of ordinary life 
insurance salesmen) has been different from 
that of the Combination Inventory (a test for 
the selection of debit life insurance salesmen). 

This latter test, available only since 1954, 
consists of five separate sections: arithmetic, 
mental alertness, vocational interest, person- 
ality (used to measure social desirability), 
and personal history. The former, i.e., the 
Aptitude Index, available since 1938, contains 
only two sections (personality and personal 
history) scored in such manner, however, as 
to yield for interpretation and use only one 
score. This, in contrast with five scores, given
by the Combination Inventory.


APTITUDE INDEX VECTORS 


Let’s next devote our attention to that part 
of Fig. 1 represented by Columns Zero and 
One. The second of these columns pertains to 
first-year survivors; the former to first-year 
terminators. These columns, considered with 
the production subdivisions (along the y 
axis), produce a rectangle with quadrants 
containing: I. First-year survivors who are 
high producers; II. First-year survivors who 
are low producers; III. First-year finals who 
are low producers; IV. First-year finals who 
are high producers. From data furnished by
Sweeney³ for a group of 500 ordinary agents
we judge that no more than 5% of a typical
group of life insurance salesman recruits fall
into Quadrant IV; 45% into Quadrant III;
and 25% into each of Quadrants II and I.
Utilizing these percentages to determine the
appropriate centers of gravity, certain vec-
tors, soon to be mentioned, have been drawn
to represent the correlations and continua
involved.

2 Men who sell insurance on a weekly and/or
monthly premium payment basis as well as upon a
quarterly, semiannual, or annual premium payment
basis and who also collect most of such payments via
personal calls at the policyowner’s home.

3 Sweeney, E. J. Personal communication.

As is well known, Peterson, as reported by 
Paterson (1953), following Kurtz (1946) in 
validating the Aptitude Index, has been most 
explicit in defining as a successful agent one 
who survives for one year and who, during 
that year, produces business in amount equal 
to, or more than, that of the average survivor 
in his company.⁴ Such an agent falls in Quad-
rant I. All other agents (those in Quadrants
II, III, and IV) have been classified tradi-
tionally, by Peterson, as failures. Thus, results
a continuum labeled AI showing whether an
applicant is (a) more like agents in Quadrant
I, or (b) more like agents in the remaining
three quadrants, the latter being treated, of
course, as an undifferentiated unit.

4 In his more recent work, in contrast with the ear-
lier from which the above was taken, Peterson has
described a successful agent as one who falls within
the top quarter of a standardized joint distribution
of first-year production and survival. This makes
the proportion of successes the same in all com-
panies. Via the definition given in the text, propor-
tion of successes varies from one company to
another.

By definition and, because of the method
used to construct its predictor, the Aptitude
Index, this continuum is related necessarily
both to production (y axis) and to termina-
tion/survival (x axis). Recalling, and making
use of the fact that a correlation (geometri-
cally considered) can be equated to the cosine
of an angle, we find the relations between the
Aptitude Index vector and each of our funda-
mental reference criteria represented by the
acute angles, α and β. α shows the relation of
the Aptitude Index with production (along
the vertical axis); β, with termination/sur-
vival (along the horizontal axis).

Taking note of a criticism directed fre-
quently against the foregoing definition of suc-
cess, Sweeney has proposed (as have others)
that agents in Quadrant IV be counted as
successes. These are agents who, in spite of
termination prior to one year in the business,
produce, during their short stay with a com-
pany, at an above-average rate. The effect of
this proposal is to shift the centers of gravity
by which successful and unsuccessful agents
can be represented: that for successes, from
Point S left toward Point S′; that for failures,
from Point F₁ downward to become Point F₂.
Hence, the continuum along which Sweeney
proposes measurement is F₂—S′ rather than
F₁—S. But since the angle, Δ, between Vec-
tors F₁—S and F₂—S′ is small, their intercor-
relation will be high. Therefore, the refine-
ment which Sweeney proposes yields differ-
entiation along a continuum not very different
from that used traditionally by Peterson.
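
The shift in centers of gravity can be illustrated with the quadrant percentages given earlier (25%, 25%, 45%, and 5% for Quadrants I through IV). The sketch below is only illustrative: the quadrant coordinates of plus or minus one are an assumption made here for convenience, not the scale used in Fig. 1, and the function names are hypothetical.

```python
import math

# Assumed coordinates: x = termination (-1) / survival (+1),
#                      y = low (-1) / high (+1) production.
QUADRANTS = {"I": (1.0, 1.0), "II": (1.0, -1.0), "III": (-1.0, -1.0), "IV": (-1.0, 1.0)}
# Percentages of recruits per quadrant, as judged in the text.
WEIGHTS = {"I": 25.0, "II": 25.0, "III": 45.0, "IV": 5.0}


def center_of_gravity(names):
    """Weighted mean position of the agents in the named quadrants."""
    total = sum(WEIGHTS[n] for n in names)
    x = sum(WEIGHTS[n] * QUADRANTS[n][0] for n in names) / total
    y = sum(WEIGHTS[n] * QUADRANTS[n][1] for n in names) / total
    return (x, y)


def vector(start, end):
    """Vector running from the failure center to the success center."""
    return (end[0] - start[0], end[1] - start[1])


def cosine(u, v):
    """Cosine of the angle between two plane vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(*u) * math.hypot(*v))


# Traditional definition: only Quadrant I counts as success.
S = center_of_gravity(["I"])
F1 = center_of_gravity(["II", "III", "IV"])
# Sweeney's proposal: Quadrant IV is counted as success too.
S_prime = center_of_gravity(["I", "IV"])
F2 = center_of_gravity(["II", "III"])

cos_delta = cosine(vector(F1, S), vector(F2, S_prime))
print(f"S = {S}, S' = {S_prime}")
print(f"F1 = {F1}, F2 = {F2}")
print(f"cosine of angle between the two continua: {cos_delta:.3f}")
```

With these assumed coordinates the success point moves left, the failure point moves down, and the cosine between the two continua comes out at about .98, which agrees with the statement that their intercorrelation will be high.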


COMBINATION INVENTORY VECTORS 


Next, let’s look at Section 3 of the Com-
bination Inventory. Containing one-half the
items in Strong’s Vocational Interest Blank
this section could, of course, yield many
scores. We have used however only the scale
for life insurance interest. From this we have
been able to predict termination/survival pro-
vided production be high and held constant
(Ferguson, 1958a). Thus, by Section 3 we
provide measurement or differentiation along
Vector CI—3. In other words we show, at the
time of employment, the chances that an ap-
plicant is apt to be more like agents in Quad-
rant I (i.e., like high producing first-year sur-
vivors) or more like those in Quadrant IV
(i.e., like high producing first-year termina-
tors). We do not measure, however, along the
Parallel Vector CI—3′ to show differences be-
tween agents in Quadrant II (i.e., low pro-
ducing first-year survivors) and those in
Quadrant III (i.e., low producing first-year
terminators). In the latter case, termination
can be due to lack of ability as well as to lack
of interest. Therefore, interest has, at this
low level of production, no predictive value.

Leonard W. Ferguson

In Fig. 1, parallel to the Vectors CI—3 and
CI—3′, are two other vectors: CI—3″ and
CI—3‴. These represent scales that could be
constructed to contrast first-year terminators
(a) with second-year survivors, and (b) with
third-year survivors, respectively. Since these
vectors are parallel their intercorrelation will
be near unity. This being so, there would seem
to be little point in trying to choose among
them except that, in terms of time, the CI—3
scale is that which first becomes available.
But to offset this, the CI—3‴ scale might be
more reliable, and it would have more differ-
ential power. This comes from the fact that
it spreads over, the range Ry + R, those ap- 
plicants who, on Vector CI—3, are confined to 
the range R,. In this latter group there may 
be agents who, while surviving for a year, will 
not survive for three years. On Vector CI—3 
they will have high scores, but on Vector 
CI—3’”, they will get low scores. 

As this point may seem abstruse, let’s con-
sider another example. Strong (1955) once
prepared two life insurance keys: one of these
to differentiate between men-in-general and
agents who sold life insurance at a rate of
$100,000 per year or more; a second to dif-
ferentiate between men-in-general and agents
who sold life insurance at a rate of $200,000
per year or more. He found that the latter
scale was much more predictive than the first.
Presumably, the $200,000 producer has, in
life insurance interest, whatever is possessed
by the $100,000 producer. This, and more.
But even the good $100,000 man may not
have what it takes, in life insurance interest,
to become a $200,000 producer. Hence, in the
case of termination/survival, the man who
survives three years must have, in life insur-
ance interest, whatever is necessary to cause
survival through two years. But the two-year
survivor may or may not have, in life insur-
ance interest, what is needed for three years
of survival.

PRODUCTION EARNINGS

Consider, next, in Fig. 1, Vectors P₁, P₂,
and P₃. These represent production in the
first-, second-, and third-year groups. With re-
spect to these vectors, note several important
things. First, they are parallel to each other,
suggesting intercorrelations near unity. Thus
except for the possibility of greater reliability
and/or contamination by situational factors,
Vector P₁ is just as useful as Vector P₃. But
it is available at the end of the first year of
any given study rather than at the end of the
third. Second, note that these vectors corre-
late not at all with the interest vectors. (The
cosine of a 90° angle is, of course, zero.)
Third, note that the production vectors cor-
relate with the AI (Aptitude Index) vector to
about the same extent as do the interest vec-
tors. By no means is this surprising for, as ob-
served, Peterson has included (traditionally)
components of both production and survival
in his definition of agent success.⁵ But since,
in the Combination Inventory, by means of
its vocational interest section, we have been
able to predict the termination/survival com-
ponent of success independently of the pro-
duction component we are motivated, of
course, to predict production independently
of termination/survival. If this is possible
(and soon we shall show that it is) we shall
have a basis for predicting any vector which
has as its two chief components production
and termination/survival.

Before proceeding to the task of predicting
production we shall need to tell how we de-
fine it. As most readers will know, the opera-
tional base for the combination agent is his
debit. For his services in receiving premiums,
he is paid a collection commission. And for
his new production he is paid, of course, a
new business commission. These statements
apply to all combination agents regardless of
company. But right at this point comparabil-
ity from one company to another stops. With-
out reviewing the many company to company
variations suffice it to say that because of
them our procedure has been as follows:

From the total earnings paid to an agent during a
given calendar year we deduct his collection com-
missions. The difference is called production earnings.
In the ideal case, this figure represents new business
commissions only but sometimes (because of book-
keeping procedures which preclude their isolation) it
includes renewal commissions and other
items. Once production earnings are computed, agents
are divided into various groups according to (a) year
of employment, (b) weeks worked during the calendar
year under consideration and, within each of these
variables, (c) collection commission. Within each ref-
erence group (which contains those who have
worked a given number of years, a given number of
weeks during a specified year, and having been paid
a specified collection commission) agents are divided,
on the basis of production earnings, into three groups.
As close as proves empirically possible this is
top, middle, and bottom thirds.

For most companies classification of 

ECONOMIC MATURITY


The studies to which we refer are several
relative to the predictive value of the per-
sonal history section of the Combination In-
ventory. The items involved, much like those
on Part I of the Aptitude Index, reflect an
applicant’s educational, family, and occupa-
tional background and were divided for our
purposes into various groups, only one of
which is of present concern. This is a set con-
taining items designated, collectively, as a
measure of economic maturity.

Briefly put, economic maturity is a measure
of the extent to which an applicant has pro-
gressed economically in relation to others of
his same age group. It is based on six items:
privately purchased life insurance, current
living expenses, and earnings: earnings last
year, year before last, year before that, and
six years ago. Of all the sets of items into
which the personal history section of the Com-
bination Inventory has been divided, this set
has the best formulated theoretical back-
ground. To this background we need to give
special attention.

For many years it has been known that the
type of item most useful in predicting success
in life insurance selling as, indeed, in many
other lines of occupational endeavor has been
that of a personal history or biographical na-
ture. But for long years nothing in particular
seemed to come of this other than the
attempts to expand, by empirical trial and
error, the number of items which might be
involved. And, as relatively modest success
has attended these efforts, ... the
(now) prevalent notion that ... selection
research in general there need ...
... personal pleasure, as well as professional
pride, to report two significant advances.

Fig. 2. The interrelations among economic maturity,
net worth, and age.

The first of these was made by Peterson 
when he utilized a fact long known but little 
applied in selection research. This was the 
fact that many of the personal history items 
found to have predictive value are those which 
are related to age—life insurance ownership 
being one obvious example. Reasoned Peter- 
son—individuals at, say, age 40 will, in ordi- 
nary circumstances, own more life insurance 
than individuals at, say, age 25. Therefore, 
before one evaluates life insurance ownership 
for its predictive value, one should make a
correction for its relation to age. To make
this possible Peterson developed life insurance
ownership norms for each of several age
groups. This made it possible to give a score
for life insurance ownership showing whether
an applicant is above, or below, or equivalent
to, the average for his own age group; rather
than above, or below, or equivalent to the
average of a group of individuals regardless
of age. Peterson did this likewise for each
of several other items: net worth; minimum
current living expenses; employment status;
length of time employed; number of depend-
ents, etc.

All these items (and others) corrected for
age, and validated individually against the
AI (Aptitude Index) continuum previously
defined, are now to be found as useful (i.e.,
as predictive) items in the current Aptitude 
Index. As Peterson has publicized his research 
too little, his significant contribution to selec- 
tion technique has not had the popularity it 
truly deserves. 
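
Peterson's age correction, as described above, amounts to scoring each applicant against the norm for his own age group. A minimal sketch follows; the age bands, dollar amounts, and the three-valued score are hypothetical choices made for illustration, not Peterson's actual norms or scoring weights.

```python
from collections import defaultdict

# Hypothetical applicants: (age group, life insurance owned in dollars).
applicants = [
    ("25-29", 5000.0), ("25-29", 9000.0), ("25-29", 7000.0),
    ("40-44", 15000.0), ("40-44", 21000.0), ("40-44", 18000.0),
]


def age_group_norms(records):
    """Average amount owned within each age group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, amount in records:
        totals[group] += amount
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


def age_corrected_score(group, amount, norms):
    """+1 if above the applicant's own age-group average, -1 if below, 0 if equal."""
    norm = norms[group]
    if amount > norm:
        return 1
    if amount < norm:
        return -1
    return 0


norms = age_group_norms(applicants)
scores = [age_corrected_score(group, amount, norms) for group, amount in applicants]

for (group, amount), score in zip(applicants, scores):
    print(f"age {group}: owns {amount:>8.0f}, group norm {norms[group]:>8.0f}, score {score:+d}")
```

In this made-up example a younger applicant owning $9,000 scores above his group norm while an older applicant owning $15,000 scores below his, even though the older man owns more in absolute terms. That reversal is the point of the correction.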

The second advance, and the only one 
which the author will claim as his own, is 
that which has led to the concept, economic 
maturity (Ferguson, 1958b). Viewing Peter- 
son’s work on the age factor, as well as other 
studies on personal history items, the author 
observed that many of the items which had 
predictive value for selection seemed to have, 
in addition to their age relations, some tinge 
of economic significance, i.e., they concerned 
money, witness: life insurance ownership, cur- 
rent living expenses, indebtedness, etc. More 
specifically than this, they might be related to 
some overall index of economic value such as 
net worth.

So, beginning where Peterson left off, the 
author plotted, as in Fig. 2, net worth as a 
function of age. Observe that the relationship, 
expressed by the cosine of the angle, 3, is posi- 
tive. Applying a Peterson-type age correction 
we obtain a factor which, while correlated 
positively with net worth, is uncorrelated with 
age. This factor, represented by the Vector 
EM, is defined to be economic maturity. 
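
The geometry of Fig. 2 can be imitated numerically. The sketch below substitutes a least-squares regression on age for Peterson's age-group norms (a swapped-in technique, chosen here only because it makes the zero correlation with age exact), and the ages and net worth figures are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def age_corrected(values, ages):
    """Residuals left after removing the least-squares linear trend on age."""
    m_age, m_val = mean(ages), mean(values)
    slope = (
        sum((a - m_age) * (v - m_val) for a, v in zip(ages, values))
        / sum((a - m_age) ** 2 for a in ages)
    )
    return [v - (m_val + slope * (a - m_age)) for a, v in zip(ages, values)]


# Hypothetical applicants: age in years, net worth in thousands of dollars.
ages = [25.0, 30.0, 35.0, 40.0, 45.0, 50.0]
net_worth = [2.0, 6.0, 5.0, 11.0, 9.0, 15.0]

economic_maturity = age_corrected(net_worth, ages)

r_age_worth = pearson_r(ages, net_worth)           # positive, as in Fig. 2
r_em_age = pearson_r(economic_maturity, ages)      # zero by construction
r_em_worth = pearson_r(economic_maturity, net_worth)

print(f"r(age, net worth)               = {r_age_worth:.3f}")
print(f"r(economic maturity, age)       = {r_em_age:.3f}")
print(f"r(economic maturity, net worth) = {r_em_worth:.3f}")
```

Because the corrected score is perpendicular to age, its correlation with net worth is the sine of the angle between age and net worth, so the squares of the first and third correlations sum to one.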

Recall, now, that net worth at any given 
point in life represents the algebraic sum- 
mation of all one’s accumulated income and 
expenses. Therefore, one could derive a second 
measure of economic maturity by considering 
individual items of income and expense. By 
this second method we developed the scale
which started this discussion: that measure
of economic maturity based on one’s position
within his own age group relative to privately
purchased life insurance, current living ex-
penses, and earnings. It is this second meas-
ure of economic maturity that, at the moment,
we are utilizing to predict production of the
debit agent.

TABLE 1

PRODUCTION EARNINGS IN RELATION
TO ECONOMIC MATURITY

Economic Maturity

Production Earnings

Top Third
Middle Third
Bottom Third

Total

1510 χ²


PREDICTION 


Look, next, at Table 1. There it is made 
evident that the higher the score on economic 
maturity, the greater the percentage of agents 
whose production earnings are above average: 
15%, 20%, and 26%, as economic maturity 
changes from below, to above average. Con- 
versely, the lower the score on economic ma- 
turity, the greater the percentage of appli- 
cants whose production earnings are below 
average: 21%, 26%, and 29% as economic 
maturity changes from above, to below, av- 
erage. 

We have now demonstrated three things: 
(a) that economic maturity (i.e., a group of 
personal history items) predicts production 
earnings. More generally, we have predicted 
performance, from which the presence of abil-
ity is usually inferred; (b) that interest pre-
dicts termination/survival, from which some
measure of satisfaction/dissatisfaction is many 
times inferred; and (c) that a test which looks 
very much as if it were a measure of aptitude 
predicts a criterion of success having as its 
two chief components performance (ability) 
and termination/survival (interest). Hence 
arises the suggestion that aptitude be con- 
sidered a joint function of ability and interest. 

If this deduction has merit it would appear, 
from Fig. 1 (if our variables have been drawn 
to proper scale), that at least one other de- 
duction could also be made. This would be to 
the effect that prediction of success over a 
relatively long period of time will depend 
much more on interest than on ability. Con- 
versely, prediction of success over a relatively 
short period of time will depend much more
on ability than on interest. Thus, aptitude, if
it really be a joint function of interest and 
ability, is itself a function of the time inter- 
val over which success is to be determined. 

To clarify this matter, compare Vector
F₁—S with Vector F₃—S‴. The latter con-
trasts (a) three-year survivors who are high
producers with (b) first-year terminators who
are poor producers. The former contrasts first-
year survivors who are high producers with
all other first-year agents. Note the angles α₃
and β₃, and compare them with the angles α
and β. β₃, being smaller than β, possesses a
larger cosine and shows that Vector F₃—S‴
is more highly correlated with the interest
vector, CI—3‴, than is Vector F₁—S. α, be-
ing less than α₃, possesses a larger cosine and
shows that Vector F₁—S is more highly cor-
related with the ability (performance) vec-
tor, P₁—S (or P₃—S‴) than is the Vector
F₃—S‴.

Reflection will now show (again, on the as- 
sumption of proper relative scaling) that, in 
predicting success, the greater the time in- 
terval involved, the more important becomes 
interest and the less important becomes abil- 
ity. We have, at least, this hypothesis. Made
explicit, we can now subject it to further
critical test.

A psychological test or inventory is usually
developed for use within a specified popula-
tion. Commonly, all reliability, normative,
and validity data have been derived from
samples of that particular population. When
the test is applied to another population with-
out further validation studies the psychologist
assumes, at least implicitly, that the two
populations are equivalent and the test will
thus be equally as valid in the new population.

This untested assumption of validity gen- 
eralization is a frequent occurrence in spite 
of repeated warnings (e.g., Anastasi, 1950; 
Cureton, 1950; Mosier, Cureton, Katzell, &
Wherry, 1951; APA, 1954). Especially when
the group to which the test is applied differs 
markedly from the original validation group, 
further evidence is needed concerning the ef- 
fectiveness of the test. 

The Minnesota Counseling Inventory (MCI)
was constructed, validated, and cross-vali-
dated using high school students as Ss (Berdie
& Layton, 1957). No evidence is available on
its validity in other settings. It has been
shown, though, that there are significant dif-
ferences in the mean scores of high school and
college students (Brown, 1958).

The purpose of this study was to investi-
gate the validity of the MCI within a popu-
lation of liberal arts college freshmen.


PROCEDURE


The Test. The MCI is a structured paper-and-
pencil personality inventory. It consists of nine scales
which were derived from the Minnesota Multiphasic
Personality Inventory (MMPI) and the Minnesota
Personality Scale (MPS). The first two provide
an indication of validity of the test profile and
... ? and L ... the MMPI. The next three
scales were derived from the MPS and are concerned
with areas of maladjustment: Family
Relationships (FR), Social Relationships (SR), and
Emotional Stability (ES). The final four are based
... They are Mood (M) ... scale, Conformity
(C) from the Pd ... scale, Reality ... the Sc ...
(Le) from ...

This article is based on data from an unpublished
dissertation submitted to the Graduate School of the
University of Minnesota in partial fulfillment of the
requirements for the degree of Doctor of Philosophy.

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Subjects 
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irom 


The total sample consisted of 1809 fresh- 
women were students 
in five Minnesota liberal arts colleges. Three colleges 
coeducational and affiliated with Protestant 
denominations. Two were Catholic—one for men and 
the other for These colleges range in 
1800 students. The coverage of students 
was virtually Only the students who did 
not take the MCI as a part of freshman orientation 
students 
the 
included 


877 men and 932 who 


were 
women size 
from 900 to 
complete 
who for various reasons 
and three 
total attrition 


programs, a 
did not 
students were 
less than 5% 

Raters. The raters 54 men and 49 women, all 
of whom proctors 
Most were upperclassmen; in one women’s dormi- 
tory they were housemothers; in the Catholic schools 
teaching Brothers and Sisters. All had 
lived in the dormitory with the students for at least 
seven months before the ratings were made 

Rating Procedure. The rating form was based on 
one developed by Berdie and Layton (1957). It con 
sisted of one rating scale for each of the seven diag- 
nostic the MCI. The rating form descrip 
tions were stated in behavioral terms but attempted 
to measure the psychological traits underlying the 
scale. (For a description of the rating form content 
see the MCI manual, pp. 10-12.) * The rater read the 
description and nominated the students who best fit 
the No ratings made for the va- 
lidity scores 

Statis 


lew 
dormitories, foreign 


The 


live in 
not was 
were 


were dormitory counselors or 


some were 


scales on 


desc ription were 
As the 
several colleges were quite small, the data were com- 
bined over all colleges for each This proceduri 
assumed that the MCI is equally valid at all colleges 
(Brown, 1958) that this 
be untenable, but small Ns at several 
definite Thus the 
groups were combined to obtain a more stable esti- 


tical Analyses nominated groups at 


sex 


data suggested as- 
sumption may 


colleges prohibited a 


some 
conclusion 
mate of mean differences 

Four of the rating scales (SR, ES, M, C) included
descriptions of both the “good” and “poor” ends of
the same scale. The nomination procedure thus
resulted in three groups of students: those nominated
as fitting the good description (NG), those
nominated as fitting the poor description (NP), and the
students who were not nominated on that scale
(NN). Three comparisons of mean differences were
made: nominated good vs. not nominated (NG-NN),
nominated poor vs. not nominated (NP-NN), and
nominated good vs. nominated poor (NG-NP); and
critical ratios were computed.

* Copies of the rating form are available on
request from the author.

On the other three scales there were only two
groups: nominated good (on Le) or nominated poor
(on FR and R) and the not nominated groups.
Critical ratios were computed between the means of the
nominated and not nominated groups.

As students were nominated as fitting either a
good or poor description, high scores on the MCI
are in the poor direction, and the direction of the
differences was hypothesized in advance, a one-tailed
test of significance was used.
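The critical ratios described above can be illustrated with a short sketch. The article does not state the formula used; the sketch assumes the conventional large-sample critical ratio, the difference between two group means divided by the standard error of that difference, referred to the normal curve for a one-tailed probability. The function names and all numerical values are hypothetical and are not taken from Tables 1 and 2.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Assumed large-sample critical ratio: mean difference over its standard error."""
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se


def one_tailed_p(cr):
    """Upper-tail probability of a standard normal deviate at least as large as cr."""
    return 0.5 * math.erfc(cr / math.sqrt(2))


# Hypothetical group statistics (not the study's data): a small nominated
# group compared with a large not nominated group on one scale.
cr = critical_ratio(mean_a=16.0, sd_a=6.0, n_a=25,
                    mean_b=11.4, sd_b=5.5, n_b=850)
p = one_tailed_p(cr)
print(f"CR = {cr:.2f}, one-tailed p = {p:.4f}")
```

Because the direction of each difference was predicted in advance, only the upper tail is used; under this assumption a critical ratio of about 1.65 corresponds to the .05 level and about 2.33 to the .01 level.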


RESULTS 


The results of the analyses are presented in 
Tables 1 and 2. 

Family Relationships. An extremely small
number of students were nominated as having
family problems. Although the raters were
instructed to nominate 10-15% of their students,
only about 3% were nominated (25
men and 27 women). Either few college students
had family problems or the counselors
were unable to identify which students were
having problems involving family relationships.
The latter is probably the more plausible
alternative.

The Ss nominated as having family prob- 
lems, however, scored distinctly higher on the 
FR scale than did the other students. Both 
critical ratios were significant at the .01 level 
and mean differences were large—at least 4.5 
raw score points. 

Social Relationships. Students nominated 
as having good social relationships differed 
clearly from the not nominated groups. Both 
critical ratios were significant at the .01 
level of confidence and mean differences were 
greater than 5.5 raw score points. These critical
ratios were larger than for any other
comparison.

The nominated good groups were also
distinguished from the nominated poor groups,
both critical ratios being significant at the .01
level. Again mean differences were large,
being 9.6 points for men and 12.7 for women.

The contrast between the nominated poor
and not nominated groups was not as clear.
The critical ratio for men was significant at
the .05 level and for women at the .01 level.
However, mean differences were sizable: 4.1
points for men and 6.4 for women.

TABLE 1
CRITICAL RATIOS BETWEEN NOMINATED AND NOT NOMINATED GROUPS OF [...]
[Table body not recoverable. Column heads: Nom. Good, Not Nom., Nom. Poor, Mean Diff. Row labels: Fam. Rel., Emot. Stab., Mood, Conf., Reality.]
* Significant at the .05 level.
** Significant at the .01 level.

TABLE 2
RATIOS BETWEEN NOMINATED AND NOT NOMINATED [...]
[Table body not recoverable. Column heads: Nom. Good, Not Nom., Nom. Poor. Row labels: Soc. Rel., Emot. Stab., Conf.]

Emotional Stability. In the high school
validation of the ES scale, the extreme groups
differed from each other but did not differ
from high school students in general. In the
present study the differentiation between
women’s groups was much better than between
men’s groups. All the critical ratios
computed on women were significant at the
.01 level. For men, two of the three critical
ratios were significant at the .05 level, with the
nominated good vs. not nominated ratio being
nonsignificant. In all cases mean differences
were small.

Mood. The M scale also differentiated between
women better than between men. The
two critical ratios that were significant both
involved women. They also involved the
nominated poor group, i.e., nominated poor vs.
nominated good and nominated poor vs. not
nominated. Thus, all significant differences on
the M scale were between the women
nominated poor and other groups of women.

Conformity. The C scale significantly
differentiated the good groups from the poor and
the poor from the not nominated. Neither of
the critical ratios between the nominated good
and not nominated groups was significant.
This failure to discriminate is probably not
crucial. In any situation where responsible
students are to be selected the criterion of
selection would certainly be other than test
scores. The counselor and Dean, however, are
interested in the student who is, or potentially
is, nonconforming and irresponsible. The C
scale identified these students.

Reality. Reality was the only scale with no 
significant critical ratios. In fact, for women 
the mean of the nominated poor group was 
lower than that of the not nominated group, 
i.e., the differences were opposite to the pre- 
dicted direction. This scale also lacked va- 
lidity with high school students. Either the 
scale is ineffective, the raters are unable to 
identify students fitting the behavioral de- 
scription, or the scale measures some factor 
other than that hypothesized in the descrip- 
tion. 

Leadership. The Le scale identified those 
students who are regarded as leaders. Both 
critical ratios were significant at the .01 level 
although the mean differences were not large. 

Thus while Berdie and Layton found six of 
the seven scales effective with high school stu- 
dents, only four scales (FR, SR, C, and Le) 
gave consistently significant results with col- 
lege students. Two other scales (ES and M) 
were effective in discriminating between groups 
of women but not between groups of men. 
Only the R scale gave no indication of being 
valid. 

Even when the critical ratios were signifi- 
cant, the overlap of raw scores between groups 
was great. Within each group there were in- 
dividual scores at all levels. Thus a high or 
low score should not be considered as defi- 
nitely indicating that the student possesses 
the characteristics of the high or low scor- 
ing group. Rather, it represents an increased 
probability that he does possess these charac- 
teristics. 
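
The meaning of an "increased probability" can be made concrete with a small sketch. It assumes, purely for illustration, that raw scores in the nominated and not nominated groups are normally distributed with equal spread; the means, standard deviation, base rate, and cutoffs are hypothetical values, not results of this study.

```python
from statistics import NormalDist

# Hypothetical score distributions; the article reports no such parameters.
nominated = NormalDist(mu=16.0, sigma=6.0)      # e.g., students nominated "poor"
not_nominated = NormalDist(mu=11.4, sigma=6.0)  # all remaining students
base_rate = 0.03                                # assumed share of nominated students


def posterior_nominated(cutoff):
    """P(nominated | score >= cutoff) by Bayes' rule under the assumptions above."""
    hit = (1 - nominated.cdf(cutoff)) * base_rate
    false_alarm = (1 - not_nominated.cdf(cutoff)) * (1 - base_rate)
    return hit / (hit + false_alarm)


# Shared area under the two density curves (0 = no overlap, 1 = identical).
overlap = nominated.overlap(not_nominated)
print(f"overlap = {overlap:.2f}")
for cutoff in (12, 16, 20):
    print(f"score >= {cutoff}: P(nominated) = {posterior_nominated(cutoff):.3f}")
```

Under these assumed values the probability rises with the score but stays well short of certainty, which is the sense in which a high score should be read as raising the likelihood of the trait rather than establishing it.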


DISCUSSION 


The design of the present study fits the 
paradigm of construct validity. What is being 
investigated are the psychological factors ac- 
counting for the performance on the inven- 
tory. Essentially both the inventory and the 
underlying constructs, operationally the rating
form descriptions, are being validated
simultaneously. This simultaneous validation
has definite implications for the interpretation
of the results of the study. It is possible
that the MCI is “more valid” than the
criterion (ratings). The imperfect differentiation
would then be the result of a weak criterion
rather than poor MCI scales.

The reliability of the MCI scales also is a 
factor in interpreting the results. As the ratings
were made 7-8 months after the inventories
were administered and the reliability of
the scales is less than perfect, the scores used 
differed somewhat from the ones that would 
have been obtained if the test were repeated 
at the time the ratings were made. Much of 
this change would probably reflect actual dif- 
ferences in personality structure produced by 
college and related experiences. However, due 
to economic and time pressures, personnel 
workers often must rely on results of tests 
given several months previously. 

The nomination procedure used, as is typi- 
cal of ratings, was subject to many biasing 
factors. Because of the design of the study 
no quantitative evaluation of these effects was 
possible. The large number of raters used, 
however, probably served to counterbalance 
some of the biasing factors present in indi- 
vidual raters. 

Sex differences in the significance of the
results were also present. More significant
differences were obtained with the women’s than
with the men’s samples (χ² = 5.00; p < .05).
As there is no reason to assume that the MCI
is more valid for women, the differences must
be attributable to rater characteristics (cf.
Taft, 1955).


SUMMARY 


Dormitory counselors at five liberal arts
colleges nominated students who fit the
behavioral descriptions on a modified version of
Berdie and Layton’s rating form. A total of
1809 college freshmen, 877 men and 932
women, constituted the population being
rated. Critical ratios were computed between
the means of the various nominated groups
within scale and sex groupings.

The following conclusions were indicated:


1. Four MCI scales, FR, SR, C, and Le, 
differentiated between groups and should be 
most useful to college counselors. 

2. Two scales, ES and M, were effective 
for women but not for men. 

3. Only the R scale gave no indication
of being valid.

In all cases extensive overlap of raw scores
between groups was present, indicating the
need for caution in interpreting individual
scores.